Only by taking infinitesimally small units for observation (the differential of history, that is, the individual tendencies of men) and attaining to the art of integrating them (that is, finding the sum of these infinitesimals) can we hope to arrive at the laws of history.
Leo Tolstoy (1828-1910) War and Peace
38.1 Integration
This chapter focuses on integrals, which were originally formulated in calculus as a type of sum. In fact, the integral symbol $\int$, introduced by Leibniz, is a 'long s' to indicate the Latin word summa (or ſumma in old-fashioned writing) meaning sum. The fundamental theorem of calculus states that differentiation and integration are the inverses of each other, so that
This equation has a rather intriguing feature: we are integrating the derivative $\mathrm{d} f / \mathrm{d} x$ over a region $(a \leq x \leq b)$ and the answer depends only on the function $f$ evaluated at the boundary of the region. Moreover, there is an interesting property of the signs on the right-hand side of eqn 38.1: $f(b)$ is included with a $+$ sign, but $f(a)$ is included with a $-$ sign. You can see where this comes from by thinking of the integral as a sum, slicing up the interval, so that we could write
\begin{equation*}
\int_{a}^{b} \frac{\mathrm{d} f}{\mathrm{d} x} \, \mathrm{d} x \approx \Delta x \sum_{i=1}^{N-1} \frac{f\left(x_{i+1}\right)-f\left(x_{i}\right)}{\Delta x} \tag{38.2}
\end{equation*}
where $\Delta x$ is the width of each slice, so that adjacent terms cancel and all that is left is $f(x_{N})-f(x_{1}) \equiv f(b)-f(a)$, the difference between the values of $f$ at the boundary of the region of integration.
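The telescoping cancellation in eqn 38.2 is easy to check numerically. The following Python sketch (the function $f$ and the interval are arbitrary choices for illustration) forms the sum of differences $f(x_{i+1})-f(x_{i})$ over a sliced interval and confirms that it collapses to $f(b)-f(a)$:

```python
import numpy as np

# Illustrative test function and interval; any smooth f works.
f = np.sin
a, b = 0.0, 2.0

# Slice [a, b] into N-1 cells and form the sum in eqn 38.2:
# sum over i of (f(x_{i+1}) - f(x_i)).
N = 1000
x = np.linspace(a, b, N)
telescoping_sum = np.sum(f(x[1:]) - f(x[:-1]))

# Adjacent terms cancel, leaving f(b) - f(a) -- exactly, not just
# in the limit of small slices.
assert np.isclose(telescoping_sum, f(b) - f(a))
```

The cancellation holds for any slicing, which is why the right-hand side of eqn 38.1 involves only the endpoint values of $f$.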
This result can be generalized to three dimensions, so that${}^{1}$
\begin{equation*}
\int \vec{\nabla} f \, \mathrm{d} V=\int f \, \mathrm{d} \vec{S} \tag{38.3}
\end{equation*}
This chapter represents the logical endpoint for the material in Part V of the book. However, it lies outside most of our remaining topics, so it can be skipped on a first reading. Stokes' theorem is used in Chapter 43.

${}^{1}$ In components, eqn 38.3 becomes
${}^{2}$ Sir George Gabriel Stokes, 1st Baronet (1819–1903) made numerous important contributions to mathematical physics in the nineteenth century. He was the first person to simultaneously be Lucasian Professor of Mathematics, President of the Royal Society, and Member of Parliament for Cambridge University. Newton had also held all three posts, but not at the same time.
where the equivalent of the sign change seen on the right-hand side of eqn 38.1 is accomplished by the surface vector $\mathrm{d} \vec{S}$ changing orientation on either side of the surface. Then putting $f=a^{j}$ (where $j=x, y$ or $z$) into eqn 38.3 and then summing over $j$ yields
which is the divergence theorem, familiar from vector calculus (and used extensively in electromagnetism and fluid dynamics). This famous result preserves the intuition we have gained from eqn 38.1, namely that the integral of the derivative of a quantity in the interior of a space is equal to some kind of sum of that quantity on the boundary of that space.
We leapt from one dimension to three dimensions. What about the result in the middle? This is a little more complicated, and a two-dimensional version of this argument yields the result that
\begin{equation*}
\int \vec{\nabla} f \times \mathrm{d} \vec{S}=-\oint f \, \mathrm{d} \vec{\ell}, \tag{38.5}
\end{equation*}
where the right-hand side is an integral round a contour. This equation follows from the idea that all the gradients of $f$ inside the surface sum to nothing, leaving only a circulating term around the boundary. The $i$th component of this equation can be written out as $\varepsilon_{i k m} \int \partial^{k} f \, \mathrm{d} S^{m}=-\oint f \, \mathrm{d} \ell_{i}$, so that substituting in $f=a^{j}$ and then setting $i=j$ yields Stokes' theorem${}^{2}$
This equation is also familiar from vector calculus, electromagnetism, and fluid mechanics, but for the purposes of this chapter it is important to notice something about the structure of this equation. It is complicated by the vector product symbol, but once again we have emerged with the intuition that an integral of some kind of gradient of a function inside the interior of a space is related to the integral of the function on the boundary. We are left with the feeling that these various results (eqns 38.1, 38.4, and 38.6) are expressing the same underlying truth, but they all look a bit different and it is hard to see how to unify them.
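Before attempting that unification, it is worth convincing ourselves numerically that a result like eqn 38.6 holds. The Python sketch below checks the circulation form of Stokes' theorem for the illustrative field $\boldsymbol{F}=(-y, x, 0)$, whose curl is $(0,0,2)$, around the unit circle (both field and contour are choices made here, not taken from the text):

```python
import numpy as np

# Parametrize the unit circle in the z = 0 plane.
t = np.linspace(0.0, 2.0 * np.pi, 100000, endpoint=False)
dt = 2.0 * np.pi / t.size
x, y = np.cos(t), np.sin(t)
dxdt, dydt = -np.sin(t), np.cos(t)

# Circulation around the boundary: oint F . dl = oint (-y dx + x dy).
circulation = np.sum((-y * dxdt + x * dydt) * dt)

# Flux of curl F = (0, 0, 2) through the unit disc: 2 * area = 2*pi.
flux = 2.0 * np.pi

assert np.isclose(circulation, flux)
```

The boundary integral and the interior integral of the curl agree, as eqn 38.6 requires.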
Can we cut through this dense thicket of different theorems for different dimensionalities? It turns out that we can, using the mathematical apparatus that we have been developing in the last few chapters. Where we will end up is the following generalized Stokes' theorem:
\begin{equation*}
\int_{C} d \tilde{\boldsymbol{\Omega}}=\int_{\partial C} \tilde{\boldsymbol{\Omega}} \tag{38.7}
\end{equation*}
This says that the integral over a space $C$ of the exterior derivative of an $n$-form $\tilde{\boldsymbol{\Omega}}$ is equal to the integral over the boundary of the space $C$, written as $\partial C$, of the $n$-form itself. Notice that it expresses exactly the intuition we have gleaned from our previous examples that were specific to particular dimensions. This new equation works in any number of dimensions and has buried inside it all the other vector theorems we have just discussed (and with which we suspect most readers will be familiar).
To make progress, and to reach our generalized Stokes' theorem (eqn 38.7), we will need to upgrade our notation. Consider a typical line integral, such as the one used to compute the work done by a force $\boldsymbol{F}$ in a three-dimensional Euclidean space expressed in Cartesian coordinates,${}^{3}$
This involves the integral of something over a one-dimensional path $C$. From our experience in the previous chapters, we recognize that the 'something' in this case resembles a 1-form, which can be rewritten in geometric notation as $\tilde{\boldsymbol{\alpha}}=F_{x} \,\boldsymbol{d} x+F_{y}\, \boldsymbol{d} y+F_{z}\, \boldsymbol{d} z$. We shall write this integral as
Admittedly, it looks as if we are making a terrible faux pas in writing an integral without including a d(something) term, violating a rule that will have been drummed into most of us by successive mathematics teachers. Nevertheless, this notation is correct: we understand the d to be part of the differential form, as we will discuss below.
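To see this notation in action, we can pull the 1-form back along a parametrized curve and integrate in the parameter. Here is a SymPy sketch with an illustrative field $\boldsymbol{F}=(y, x, 0)$, i.e. $\tilde{\boldsymbol{\alpha}}=y\,\boldsymbol{d}x+x\,\boldsymbol{d}y$ (an exact form, $\boldsymbol{d}(xy)$, chosen here so the answer is known in advance):

```python
import sympy as sp

t = sp.symbols('t')
x, y, z = sp.symbols('x y z')

# Illustrative 1-form alpha = y dx + x dy (+ 0 dz), pulled back along
# the quarter circle (x, y, z) = (cos t, sin t, 0), 0 <= t <= pi/2.
curve = {x: sp.cos(t), y: sp.sin(t), z: sp.Integer(0)}
Fx, Fy, Fz = y, x, sp.Integer(0)

# Pullback: replace dx by (dx/dt) dt, etc., then substitute the curve.
integrand = (Fx * sp.diff(curve[x], t)
             + Fy * sp.diff(curve[y], t)
             + Fz * sp.diff(curve[z], t)).subs(curve)

I1 = sp.integrate(sp.simplify(integrand), (t, 0, sp.pi / 2))

# Since alpha = d(xy) is exact, I1 = (xy) at the end minus (xy) at the
# start: 0*1 - 1*0 = 0.
assert I1 == 0
```

The 'missing d' is supplied by the form itself: each $\boldsymbol{d}x^{i}$ becomes $(\mathrm{d}x^{i}/\mathrm{d}t)\,\mathrm{d}t$ once the curve is parametrized.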
Next, consider a surface integral. Recalling from Chapters 32 and 37 that areas can be written as bivectors, we can write a typical surface integral of a function as
\begin{equation*}
I_{2}=\int_{S} \boldsymbol{G} \cdot \mathrm{d} \boldsymbol{S}=\int_{S}\left(G_{x}\, \boldsymbol{d} y \wedge \boldsymbol{d} z+G_{y}\, \boldsymbol{d} z \wedge \boldsymbol{d} x+G_{z}\, \boldsymbol{d} x \wedge \boldsymbol{d} y\right) \tag{38.10}
\end{equation*}
where the elements of the surface $S$ have been recast using the wedge product. Written in this way, the integral involves the integral of a 2-form $\tilde{\boldsymbol{\beta}}=G_{x}\, \boldsymbol{d} y \wedge \boldsymbol{d} z+G_{y}\, \boldsymbol{d} z \wedge \boldsymbol{d} x+G_{z}\, \boldsymbol{d} x \wedge \boldsymbol{d} y$ over a surface $S$:
In the remainder of this chapter, we generalize these observations and see how all multiple integrals can be understood as an integral of a form. We will start by investigating how this might work and why forms are the natural content of the integrand of an integral. We then go on to investigate the surfaces over which we integrate: these are members of a family of objects called chains that are dual to forms, which is to say we have an inner product-like relationship
\begin{equation*}
\langle \text{form},\ \text{chain} \rangle \text{ ``='' } \binom{\text{multiple}}{\text{integral}} = \text{(number)}. \tag{38.12}
\end{equation*}
The rest of this chapter discusses this idea, leading eventually to our generalized version of Stokes' theorem, which is one of the milestones of differential geometry.

${}^{3}$ We shall restrict our attention to flat space and spacetime in this chapter.
38.2 Integrating over forms
A 0-form is a function. A 1-form looks like a series of planes. A 2-form looks like a set of crossed surfaces forming a series of parallel tubes. A 3-form looks like an array of cells. These geometric objects can be used to provide the basic integrand of (function) $\times$ (integration cells) over which to integrate.
A definite integral outputs a number. A multiple integral is carried out over a generalized surface: a one-dimensional line, two-dimensional surface, three-dimensional volume or even a higher-dimensional surface. Geometrically, it is the combination of a differential form and a surface that gives rise to the number. Specifically, the line integral evaluates the number of times the line crosses the surfaces of the 1-form. That is, for a 1-form $\tilde{\boldsymbol{Y}}$, we have
\begin{equation*}
\int_{\text{line}} \tilde{\boldsymbol{Y}}=\binom{\text{number of}}{\text{surfaces cut}} \tag{38.13}
\end{equation*}
In the same way, the surface integral counts the number of tubes of the 2-form that cut the 2-surface. That is, for a 2-form $\tilde{\boldsymbol{X}}$, we have
\begin{equation*}
\int_{\text{surface}} \tilde{\boldsymbol{X}}=\binom{\text{number of}}{\text{tubes cut}} \tag{38.14}
\end{equation*}
The volume integral evaluates the number of cells of the 3-form contained by a 3-volume, or, for a 3-form $\tilde{\boldsymbol{W}}$, we have
\begin{equation*}
\int_{\text{volume}} \tilde{\boldsymbol{W}}=\binom{\text{number of}}{\text{cells contained}} \tag{38.15}
\end{equation*}
If we accept that integrals are indeed all carried out over forms, then we have a neat explanation for a potentially puzzling aspect of multiple integrals: the existence of the Jacobian.${}^{4}$ Recall how multiple integration is often introduced. We split a surface up into an array of infinitesimal cells and evaluate a function over the surface by summing its value on each of the cells. If we use a Cartesian coordinate system, a typical volume integral then looks like
\begin{equation*}
\int_{V} f(x, y, z) \, \mathrm{d} x \, \mathrm{d} y \, \mathrm{d} z \tag{38.16}
\end{equation*}
However, when we generalize to different coordinate systems, we need to include a factor of the Jacobian determinant $J$, which guarantees that the cells defined by the new coordinates correctly mesh, allowing the volume to be covered without any gaps. Given two coordinate systems, with coordinates $\left(x^{0}, \ldots, x^{n}\right)$ and $\left(y^{0}, \ldots, y^{n}\right)$, the Jacobian determinant is defined as
The determinant is an antisymmetric object and can be formed using${}^{5}$ the antisymmetric symbol $\varepsilon_{\mu \alpha \beta \gamma}$.
So how does the Jacobian factor arise? Let's investigate what happens to an element of surface, written as a 2-form $\boldsymbol{d} \alpha^{1} \wedge \boldsymbol{d} \alpha^{2}$, when a change of coordinates is made.
Example 38.1
Suppose we have a mesh given in terms of coordinates $\alpha^{1}$ and $\alpha^{2}$ and we want to write it in terms of coordinates $x^{1}$ and $x^{2}$. We write the basis 1-forms in terms of the new coordinates as follows:
where we recognize the Jacobian determinant $J$ in the final line.
We conclude that the Jacobian determinant is generated naturally by the algebra of forms.
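We can watch this happen symbolically. In the sketch below (polar coordinates, an illustrative choice), expanding $\boldsymbol{d}x \wedge \boldsymbol{d}y$ in terms of $\boldsymbol{d}r$ and $\boldsymbol{d}\theta$ reproduces the Jacobian factor $r$:

```python
import sympy as sp

r, theta = sp.symbols('r theta', positive=True)

# Cartesian coordinates in terms of polar ones.
x = r * sp.cos(theta)
y = r * sp.sin(theta)

# Expand dx ^ dy with dx = x_r dr + x_theta dtheta (likewise dy).
# The dr^dr and dtheta^dtheta terms vanish, and antisymmetry of ^ gives
# dx ^ dy = (x_r y_theta - x_theta y_r) dr ^ dtheta = J dr ^ dtheta.
J = sp.simplify(sp.diff(x, r) * sp.diff(y, theta)
                - sp.diff(x, theta) * sp.diff(y, r))

# Same result from the Jacobian determinant computed directly.
J_det = sp.simplify(sp.Matrix([[sp.diff(x, r), sp.diff(x, theta)],
                               [sp.diff(y, r), sp.diff(y, theta)]]).det())

assert J == r and J_det == r
```

The familiar area element $r \, \mathrm{d}r \, \mathrm{d}\theta$ drops out of the wedge-product algebra with no extra input.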
38.3 Anatomy of an integral
We saw in the last section how a surface and a form are combined to give a number. This is reminiscent of how a 1-form $\tilde{\boldsymbol{\alpha}}$ and its dual, a vector $\boldsymbol{v}$, are combined to make a number. This, of course, is done using an inner product $\langle\tilde{\boldsymbol{\alpha}}, \boldsymbol{v}\rangle$. By examining the anatomy of a multiple integral, we shall see how a form can be combined in an inner product with its dual in this context: the generalized surface. This inner product gives us our integral.
A general multiple integral can be written as${}^{6}$
${}^{5}$ Specifically, we can also write $J$ as a sum
\begin{equation*}
J=\varepsilon_{i_{0} i_{1} \ldots i_{n}} \frac{\partial x^{i_{0}}}{\partial y^{0}} \frac{\partial x^{i_{1}}}{\partial y^{1}} \cdots \frac{\partial x^{i_{n}}}{\partial y^{n}}. \tag{38.18}
\end{equation*}

${}^{6}$ We label coordinates here starting from 1 (rather than zero, as we have been doing for (3+1)-dimensional spacetime).
This says that there is a surface $\mathcal{S}$ parametrized by some variables $\lambda^{i}$. We integrate a function $f\left(x^{1}, \ldots, x^{n}\right)$ by summing the function multiplied by a surface element $\mathrm{d} x^{1} \ldots \mathrm{d} x^{n}$ written in the function's coordinates. The Jacobian guarantees that the surface elements (given in terms of $x^{i}$) mesh to cover the surface (given in terms of $\lambda^{i}$). The key points of this section are that:
The function and surface element together can be represented as an $n$-form $\tilde{\boldsymbol{\alpha}}$;
The surface can be represented as an $n$-vector $\boldsymbol{S}$;
The integrand, complete with Jacobian, can be recreated by taking the inner product of the $n$-form $\tilde{\boldsymbol{\alpha}}$ and the $n$-vector $\boldsymbol{S}$;
Geometrically, the integral is an inner product that counts the number of $n$-cells representing $\tilde{\boldsymbol{\alpha}}$ contained in the $n$-parallelepiped that represents $\boldsymbol{S}$.
Let's break this up into parts. We first look at the function and surface element that are contained in the form $\tilde{\boldsymbol{\alpha}}$, which looks like
This clearly falls apart into (i) a function and (ii) a surface element built from forms. The form in the surface element has the necessary properties to naturally give the Jacobian determinant required for our choice of coordinates. In the next example, we remind ourselves how to represent a 4-volume and 3-volume in terms of forms.
Example 38.2
Working in (3+1)-dimensional flat space with coordinates $(t, x, y, z)$, we can interpret the 4-volume element as a volume 4-form${}^{7}$
\begin{align*}
\tilde{\boldsymbol{\omega}}(\,,\,,\,,\,) & =\frac{1}{4!} \varepsilon_{\mu \nu \alpha \beta}\, \boldsymbol{d} x^{\mu} \wedge \boldsymbol{d} x^{\nu} \wedge \boldsymbol{d} x^{\alpha} \wedge \boldsymbol{d} x^{\beta} \\
& =\varepsilon_{|\mu \nu \alpha \beta|}\, \boldsymbol{d} x^{\mu} \wedge \boldsymbol{d} x^{\nu} \wedge \boldsymbol{d} x^{\alpha} \wedge \boldsymbol{d} x^{\beta} \\
& =\varepsilon_{0123}\, \boldsymbol{d} t \wedge \boldsymbol{d} x \wedge \boldsymbol{d} y \wedge \boldsymbol{d} z \\
& =\boldsymbol{d} t \wedge \boldsymbol{d} x \wedge \boldsymbol{d} y \wedge \boldsymbol{d} z \tag{38.23}
\end{align*}
where we've used the $|\mu \nu \alpha \beta|$ notation.${}^{8}$ (One message here is that we can leave the sum unrestricted and include the factor $1/4!$, or restrict the sum.)

${}^{8}$ Recall that this notation forces $\alpha<\beta<\gamma<\delta$ where we've used the $|\alpha \beta \gamma \delta|$ notation.

${}^{9}$ The 3-form $\mathrm{d} \tilde{\boldsymbol{\Sigma}}_{\mu}$ is dual to a basis vector $\boldsymbol{e}_{\mu}$, i.e. $\mathrm{d} \tilde{\boldsymbol{\Sigma}}_{\mu}=\star \boldsymbol{e}_{\mu}$. If we fill the three slots of the tensor $\mathrm{d} \tilde{\boldsymbol{\Sigma}}_{\mu}$, we obtain the components of the 3-volume 1-form $\mathrm{d} \tilde{\boldsymbol{\sigma}}\left(\boldsymbol{e}_{\mu}\right)$ from the last chapter.
In the same way, the 3-volume element can be provided by a 3-form${}^{9}$
In many applications, we shall integrate 3-form objects written as $J^{\mu} \mathrm{d} \tilde{\boldsymbol{\Sigma}}_{\mu}$ over some surface. In terms of forms, this is to be interpreted as
With the integrand identified as a form, we now attempt to describe the generalized surface as an $n$-vector. This is possible since, just as $n$-forms are built from wedge products of 1-forms, the generalized surface over which we integrate can be built from wedge products of vectors. Let's consider doing a general multiple integral
where the form $\tilde{\boldsymbol{\alpha}}$ is integrated over the $n$-dimensional surface $\mathcal{S}$. To see how to construct an element of the generalized $n$-dimensional surface $\mathcal{S}$, consider the vicinity of a point $\mathcal{P}$ on a surface $\mathcal{P}\left(\lambda^{1}, \lambda^{2}, \ldots, \lambda^{n}\right)$, where $\lambda^{i}$ are the coordinates parametrizing the surface. A small displacement in the $\lambda^{i}$ direction is written as $\left(\partial \mathcal{P} / \partial \lambda^{i}\right) \Delta \lambda^{i}$, which is formed from a vector $\partial \mathcal{P} / \partial \lambda^{i}$ and an infinitesimal length element $\Delta \lambda^{i}$. In building a surface, we can take the wedge products of the vector parts to form a generalized $n$-dimensional parallelepiped. Therefore, tangent to the surface $\mathcal{S}$ at point $\mathcal{P}$ is an infinitesimal parallelepiped built from vectors
To perform the integral, we evaluate how many infinitesimal cells represented by the integration form $\tilde{\boldsymbol{\alpha}}$ are cut by the surface. Recall that the machine that works this out is the inner product,${}^{10}$ since this tells us how an $n$-form and an $n$-vector mesh together.${}^{11}$ We therefore interpret the integral symbol as follows:
Let's see how this generates the Jacobian. Consider basis 1-forms $\boldsymbol{d} x^{1}, \ldots, \boldsymbol{d} x^{n}$ such that the form is written as
We check this gives the expected answer for the simplest possible case in the next example.
Example 38.3
Consider a line integral $\int_{\text{curve}} \boldsymbol{d} f$. This can be done by inspection: the answer is simply the difference in the values of the function $f$ at the start and end of the curve.${}^{12}$ We parametrize the curve with $0 \leq \lambda \leq 1$ and we have
tells us to evaluate how many tubes of $\tilde{\boldsymbol{F}}$ are cut by the bivector $\boldsymbol{u} \wedge \boldsymbol{v}$. Looking ahead, we put the material in this section to work in Chapter 43 by identifying an infinitesimal surface $\Delta x\, \boldsymbol{e}_{x} \wedge \Delta y\, \boldsymbol{e}_{y}$ and evaluating the integral
\begin{aligned}
\int_{\Delta x \boldsymbol{e}_{x} \wedge \Delta y \boldsymbol{e}_{y}} \tilde{\boldsymbol{F}} &=\int\left\langle\tilde{\boldsymbol{F}}, \boldsymbol{e}_{x} \wedge \boldsymbol{e}_{y}\right\rangle \Delta x\, \Delta y \\
&=\int \tilde{\boldsymbol{F}}\left(\boldsymbol{e}_{x}, \boldsymbol{e}_{y}\right) \Delta x\, \Delta y \\
&=\int F_{x y}\, \Delta x\, \Delta y .
\end{aligned}
See Chapter 43 for the details.

${}^{11}$ Remember the rule that $\langle\tilde{\boldsymbol{\alpha}}, \boldsymbol{v}\rangle=\alpha_{\left|i_{1} \ldots i_{n}\right|} v^{i_{1} \ldots i_{n}}$.

${}^{12}$ This integral is independent of the path taken by the curve, showing that the form $\boldsymbol{d} f$ corresponds to a conservative force field. We can generalize: the integral of an exact 1-form $\boldsymbol{d} f$ is path independent, and given by the change in the function $f$ along the curve.
Fig. 38.1 Boundaries of several objects. In some cases, for spaces with no boundary, we generate the empty set (shown schematically as the space vanishing 'in a puff of logic', in Douglas Adams' memorable phrase). This figure illustrates some of the examples given in Example 38.4.

${}^{13}$ The concept of chains comes from algebraic topology, where the chain group is an algebraic structure composed by constructing linear combinations of 'simplicial complexes', the idea being that a topological space can be constructed by gluing together little triangles or tetrahedra, or other higher-dimensional basic objects (the simplicial complexes). For more details, see e.g. Nakahara (1990), but this algebraic treatment is not important for our purposes here and we can think of chains simply as topological spaces.
We recover the expected result.
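The path independence noted in the margin can also be checked numerically: integrating $\boldsymbol{d}f$ along two different curves between the same endpoints gives the same answer. A Python sketch with an illustrative choice $f(x, y)=x^{2} y+\sin y$:

```python
import numpy as np

# Illustrative scalar function; integrating df along any curve from P
# to Q should give f(Q) - f(P), whatever the path.
f = lambda x, y: x**2 * y + np.sin(y)

def integrate_df(path, n=200000):
    """Riemann sum of df = f_x dx + f_y dy along a parametrized path."""
    t = np.linspace(0.0, 1.0, n)
    x, y = path(t)
    fx = 2 * x * y                         # df/dx for this f
    fy = x**2 + np.cos(y)                  # df/dy for this f
    return np.sum(fx[:-1] * np.diff(x) + fy[:-1] * np.diff(y))

straight = lambda t: (t, t)                          # line (0,0) -> (1,1)
wiggly = lambda t: (t, t + 0.3 * np.sin(np.pi * t))  # same endpoints

expected = f(1.0, 1.0) - f(0.0, 0.0)
assert np.isclose(integrate_df(straight), expected, atol=1e-4)
assert np.isclose(integrate_df(wiggly), expected, atol=1e-4)
```

Both paths return $f(1,1)-f(0,0)$, as the exactness of $\boldsymbol{d}f$ guarantees.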
We conclude that an integral is an inner product between a form and a surface. Since it outputs a number, the surfaces must be dual to the forms. In the next section, we examine the properties of the surfaces in more detail.
38.4 Boundaries and chains
Here we examine the properties of the generalized surfaces we integrate over. First, a definition: the boundary $\partial M$ of a region $M$ consists of those points of $M$ that do not lie in the interior.
Example 38.4
The boundary of the closed unit disc is the unit circle.
The boundary of a finite cylinder is two circles.
The boundary of a line is its two end points.
The boundary of the unit sphere is empty (i.e. there isn't one).
We regard $\partial$ as an operator that extracts the boundary from a space $M$. Some examples of the use of this operator are shown in Fig. 38.1.
With boundaries at our disposal, we turn to the main topic of this section. The generalized surfaces over which we integrate form a family of objects we call chains.${}^{13}$ As described above, forms are dual to chains, in that we combine them to output a number. We define and denote our chains as follows:
The boundary operator $\partial$ acts on the chains, mapping a chain $C_{n}$ on to a chain $C_{n-1}$. That is, it inputs an $n$-chain and outputs an $(n-1)$-chain. We have, therefore, that
Some chains have no boundary. A closed line like a circle has no boundary. We call these boundary-free chains closed chains or cycles, often denoted $Z_{n}$, with the property $\partial Z_{n}=0$. Some chains are boundaries of higher-dimensional chains. A closed surface is the boundary of a volume; a closed line is the boundary of a surface; the boundary of a line is two points. In short, the boundary of an $(n+1)$-chain is an $n$-chain. If $B_{n}$ is the boundary of an $(n+1)$-chain, we have
Combining the two previous equations, we see that $\partial \partial C_{n+1}=0$. Since there was nothing special about $C_{n+1}$, we conclude that
or, in words, the boundary of a boundary is zero.${}^{14}$
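The rule $\partial \partial=0$ can be made concrete with a toy boundary operator acting on simplices. The sketch below represents a chain as a dictionary from oriented simplices (tuples of vertex labels, a convention adopted here purely for illustration) to integer coefficients, and takes the boundary as the alternating sum of faces:

```python
from collections import defaultdict

def boundary(chain):
    """Boundary of a chain: the alternating sum of the faces of each
    simplex, extended linearly over the chain's coefficients."""
    out = defaultdict(int)
    for simplex, coeff in chain.items():
        for i in range(len(simplex)):
            face = simplex[:i] + simplex[i + 1:]   # drop vertex i
            out[face] += (-1) ** i * coeff
    return {s: c for s, c in out.items() if c != 0}

# A single tetrahedron: a 3-chain with one 3-simplex.
tet = {(0, 1, 2, 3): 1}

faces = boundary(tet)       # four oriented triangles (a 2-chain)
edges = boundary(faces)     # their boundary edges cancel pairwise

assert len(faces) == 4
assert edges == {}          # the boundary of a boundary is zero
```

Every edge appears twice with opposite signs in $\partial\partial$, which is the combinatorial content of eqn $\partial \partial C_{n+1}=0$.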
38.5 Stokes' theorem
An integral involves the inner product of an $n$-chain and an $n$-form. The pinnacle of this description of calculus in terms of chains and forms is Stokes' theorem.${}^{15}$ In order to understand the content of the theorem, we need to bring in one final (but familiar) piece of technology: the exterior derivative $\boldsymbol{d}$. The key to Stokes' theorem is to note that just as the boundary operator $\partial$ takes an $(n+1)$-chain and outputs an $n$-chain, the exterior derivative operator $\boldsymbol{d}$ takes an $n$-form and outputs an $(n+1)$-form.
Example 38.5
Recall that in three dimensions with basis 1-forms $\boldsymbol{d} x, \boldsymbol{d} y$ and $\boldsymbol{d} z$, we have the following forms:
\begin{array}{ll}
\text{0-form} & \tilde{\boldsymbol{Z}}=f \\
\text{1-form} & \tilde{\boldsymbol{Y}}=A\, \boldsymbol{d} x+B\, \boldsymbol{d} y+C\, \boldsymbol{d} z \\
\text{2-form} & \tilde{\boldsymbol{X}}=a\, \boldsymbol{d} x \wedge \boldsymbol{d} y+b\, \boldsymbol{d} y \wedge \boldsymbol{d} z+c\, \boldsymbol{d} z \wedge \boldsymbol{d} x \\
\text{3-form} & \tilde{\boldsymbol{W}}=F\, \boldsymbol{d} x \wedge \boldsymbol{d} y \wedge \boldsymbol{d} z
\end{array}
We can operate with the exterior derivative operator on the 1-form to make
\begin{align*}
\boldsymbol{d} \tilde{\boldsymbol{Y}} & =\frac{\partial A}{\partial y} \boldsymbol{d} y \wedge \boldsymbol{d} x+\frac{\partial A}{\partial z} \boldsymbol{d} z \wedge \boldsymbol{d} x \\
& \quad +\frac{\partial B}{\partial x} \boldsymbol{d} x \wedge \boldsymbol{d} y+\frac{\partial B}{\partial z} \boldsymbol{d} z \wedge \boldsymbol{d} y \\
& \quad +\frac{\partial C}{\partial x} \boldsymbol{d} x \wedge \boldsymbol{d} z+\frac{\partial C}{\partial y} \boldsymbol{d} y \wedge \boldsymbol{d} z \\
& =\left(\frac{\partial C}{\partial y}-\frac{\partial B}{\partial z}\right) \boldsymbol{d} y \wedge \boldsymbol{d} z+\left(\frac{\partial A}{\partial z}-\frac{\partial C}{\partial x}\right) \boldsymbol{d} z \wedge \boldsymbol{d} x \\
& \quad +\left(\frac{\partial B}{\partial x}-\frac{\partial A}{\partial y}\right) \boldsymbol{d} x \wedge \boldsymbol{d} y \tag{38.40}
\end{align*}
However, if we operate a second time, we find $\boldsymbol{d} \boldsymbol{d} \tilde{\boldsymbol{Y}}=0$. This property, that two applications of the exterior derivative operator yield zero, is a general one. Trying again, starting from the 2-form, we find
\begin{equation*}
\boldsymbol{d} \tilde{\boldsymbol{X}}=\left(\frac{\partial a}{\partial z}+\frac{\partial b}{\partial x}+\frac{\partial c}{\partial y}\right) \boldsymbol{d} x \wedge \boldsymbol{d} y \wedge \boldsymbol{d} z \tag{38.41}
\end{equation*}
but then $\boldsymbol{d} \boldsymbol{d} \tilde{\boldsymbol{X}}=0$.
Each of the results from the last example is summarized by the expression, familiar from Chapter 33, that
\begin{equation*}
\boldsymbol{d} \boldsymbol{d}=0 . \tag{38.42}
\end{equation*}
${}^{16}$ An $n$-form $\tilde{\boldsymbol{A}}$ is closed if $\boldsymbol{d} \tilde{\boldsymbol{A}}=0$. An $n$-form is called exact if it is the derivative of an $(n-1)$-form. All exact forms are clearly closed. Are all closed forms exact? That is, is the property that the exterior derivative gives zero enough to guarantee that we can write the form as the exterior derivative of another object? The Poincaré lemma says that if $\boldsymbol{d} \tilde{\boldsymbol{A}}=0$ throughout a simply connected region of space, then $\tilde{\boldsymbol{A}}=\boldsymbol{d} \tilde{\boldsymbol{B}}$ for some $\tilde{\boldsymbol{B}}$, and so the form is exact. A space that is not simply connected is discussed in the exercises.

${}^{17}$ This fussy definition really just means we need to define a consistent positive sense of pointing away from the volume $C$.

${}^{18}$ Stokes' theorem is proved in Needham (2021) and, at a more advanced level, in Spivak's Calculus on Manifolds, both of which are highly recommended. Of Stokes' theorem, Spivak notes: '1. It is trivial. 2. It is trivial because the terms appearing in it have been properly defined. 3. It has significant consequences.'

${}^{19}$ Recall that in the coordinate frame, we need a factor $\left[(-1)^{s} \operatorname{det} g_{\mu \nu}\right]^{\frac{1}{2}}$, where $s$ is the number of minuses in the metric signature. We'll assume here that, since we're looking at three-dimensional space, $s=0$, and write $\operatorname{det} g_{\mu \nu}=g$.
This equation is the dual of the equation $\partial^{2}=0$.${}^{16}$
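In three dimensions, eqns 38.40 and 38.41 identify $\boldsymbol{d}$ acting on 0-, 1- and 2-forms with grad, curl and div, so $\boldsymbol{d}\boldsymbol{d}=0$ encodes the familiar identities $\nabla \times (\nabla f)=0$ and $\nabla \cdot (\nabla \times \boldsymbol{A})=0$. A short symbolic check with SymPy (the function components are arbitrary, introduced here for illustration):

```python
import sympy as sp

x, y, z = sp.symbols('x y z')

# Arbitrary scalar and vector fields (symbolic function placeholders).
f = sp.Function('f')(x, y, z)
A = [sp.Function('Ax')(x, y, z),
     sp.Function('Ay')(x, y, z),
     sp.Function('Az')(x, y, z)]

grad_f = [sp.diff(f, v) for v in (x, y, z)]

def curl(V):
    return [sp.diff(V[2], y) - sp.diff(V[1], z),
            sp.diff(V[0], z) - sp.diff(V[2], x),
            sp.diff(V[1], x) - sp.diff(V[0], y)]

def div(V):
    return sp.diff(V[0], x) + sp.diff(V[1], y) + sp.diff(V[2], z)

# dd = 0 on a 0-form: curl grad f = 0 (mixed partials commute).
assert all(sp.simplify(c) == 0 for c in curl(grad_f))

# dd = 0 on a 1-form: div curl A = 0.
assert sp.simplify(div(curl(A))) == 0
```

Both identities reduce to the equality of mixed partial derivatives, which is the analytic content of $\boldsymbol{d}\boldsymbol{d}=0$.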
With the operators $\boldsymbol{d}$ and $\partial$ in hand, we can write down Stokes' theorem. The content of the theorem reflects the ability to use the operators to equate the inner product of an $(n+1)$-chain $C$ and $(n+1)$-form $\boldsymbol{d} \tilde{\boldsymbol{\Omega}}$ with that of an $n$-chain $\partial C$ and $n$-form $\tilde{\boldsymbol{\Omega}}$.
Stokes' theorem says that if $\tilde{\boldsymbol{\Omega}}$ is an $n$-form and $C$ is an $(n+1)$-chain, then we have
\begin{equation*}
\int_{\partial C} \tilde{\boldsymbol{\Omega}}=\int_{C} \boldsymbol{d} \tilde{\boldsymbol{\Omega}} \tag{38.43}
\end{equation*}
where the orientation of the surface $\partial C$ must be chosen${}^{17}$ consistently with the orientation of $C$.
We shall not prove Stokes' theorem here;${}^{18}$ instead, we put the theorem to work in the examples below. We warm up with the computation of some volume elements.
Example 38.6
Working in a coordinate frame in three-dimensional space with coordinate system $(x, y, z)$ and a metric $\boldsymbol{g}$, we want to integrate a function $f(x, y, z)$ over a 3-volume $V$. The volume 3-form${}^{19}$ for the space is given by
The chain $C_{3}=V$ tells us the size of the volume over which to integrate, using an integration volume element $\mathrm{d} V$. The latter is a box with sides $\mathrm{d} x\, \boldsymbol{e}_{x}$, $\mathrm{d} y\, \boldsymbol{e}_{y}$ and $\mathrm{d} z\, \boldsymbol{e}_{z}$, such that we have $\mathrm{d} V=\tilde{\boldsymbol{\omega}}\left(\mathrm{d} x\, \boldsymbol{e}_{x}, \mathrm{d} y\, \boldsymbol{e}_{y}, \mathrm{d} z\, \boldsymbol{e}_{z}\right)=\sqrt{g}\, \mathrm{d} x\, \mathrm{d} y\, \mathrm{d} z$. A volume integral is then written as
\begin{equation*}
\int_{C_{3}} f \tilde{\boldsymbol{\omega}}=\int_{C_{3}} f \sqrt{g}\, \boldsymbol{d} x \wedge \boldsymbol{d} y \wedge \boldsymbol{d} z=\int_{V} f(x, y, z) \sqrt{g}\, \mathrm{d} x\, \mathrm{d} y\, \mathrm{d} z \tag{38.47}
\end{equation*}
In flat space, of course, we have g=1g=1.
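For a concrete curved-coordinate case (spherical coordinates, an illustrative choice): there $\sqrt{g}=r^{2} \sin \theta$, and integrating $f=1$ over a ball of radius $R$ with eqn 38.47 should reproduce the familiar volume $\tfrac{4}{3}\pi R^{3}$. A SymPy sketch:

```python
import sympy as sp

r, theta, phi, R = sp.symbols('r theta phi R', positive=True)

# Spherical coordinates: det g = r^4 sin^2(theta), so
# sqrt(g) = r^2 sin(theta), giving the volume 3-form
# omega = r^2 sin(theta) dr ^ dtheta ^ dphi.
sqrt_g = r**2 * sp.sin(theta)

# Integrate f = 1 over a ball of radius R.
V = sp.integrate(sqrt_g,
                 (r, 0, R), (theta, 0, sp.pi), (phi, 0, 2 * sp.pi))

assert sp.simplify(V - sp.Rational(4, 3) * sp.pi * R**3) == 0
```

The $\sqrt{g}$ factor is exactly the Jacobian that the form machinery of Section 38.2 produces automatically.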
Example 38.7
Let's integrate a function $f(x, y, z)$ over an area $S$ in flat space.${}^{20}$ One way to construct the integration surface element is to identify an element spanned by vectors $\mathrm{d} \boldsymbol{u}$ and $\mathrm{d} \boldsymbol{v}$ with unit outward normal $\hat{\boldsymbol{n}}$ and write it as
where $\mathrm{d} \boldsymbol{S}=\mathrm{d} \boldsymbol{u} \times \mathrm{d} \boldsymbol{v}$. We spot that the expression in eqn 38.48 is equivalent to the 2-form
\begin{equation*}
\tilde{\boldsymbol{\Sigma}}(\,,\,)=\hat{n}^{x}\, \boldsymbol{d} y \wedge \boldsymbol{d} z+\hat{n}^{y}\, \boldsymbol{d} z \wedge \boldsymbol{d} x+\hat{n}^{z}\, \boldsymbol{d} x \wedge \boldsymbol{d} y \tag{38.49}
\end{equation*}
when its slots are filled with $\mathrm{d}\boldsymbol{u}$ and $\mathrm{d}\boldsymbol{v}$. We can therefore perform the surface integral
\begin{equation*}
\int_{C_{2}} f \tilde{\boldsymbol{\Sigma}}=\int_{S} f(x, y, z)\, \hat{\boldsymbol{n}} \cdot \mathrm{d} \boldsymbol{S} \tag{38.50}
\end{equation*}
Example 38.8
We can now use Stokes' theorem to see how the flat-space integral theorems of vector calculus can be described in the new language by considering a function $f$ (i.e. a 0-form) integrated over the geometries shown in Fig. 38.2. In Fig. 38.2(a), we have
as we saw in Example 12. Similarly, for Fig. 38.2(b) we have
\begin{equation*}
\int_{S} \boldsymbol{d} f \wedge \boldsymbol{d} x^{i}=\int_{\Gamma} f\, \boldsymbol{d} x^{i} \tag{38.52}
\end{equation*}
and for the volume VV in Fig. 38.2(c) we obtain
\begin{equation*}
\int_{V} \boldsymbol{d} f \wedge \boldsymbol{d} x^{i} \wedge \boldsymbol{d} x^{j}=\int_{\Sigma} f\, \boldsymbol{d} x^{i} \wedge \boldsymbol{d} x^{j} \tag{38.53}
\end{equation*}
To investigate this final geometry in more detail, consider the 2-form
\begin{equation*}
\tilde{\boldsymbol{A}}=A_{x}\, \boldsymbol{d} y \wedge \boldsymbol{d} z+A_{y}\, \boldsymbol{d} z \wedge \boldsymbol{d} x+A_{z}\, \boldsymbol{d} x \wedge \boldsymbol{d} y \tag{38.54}
\end{equation*}
and the 3-chain $C_3 = V$, a volume in flat space, with boundary $\partial V = \Sigma$. We then have, by Stokes' theorem,
\begin{align*}
& \int_{\Sigma} A_{x}\, \boldsymbol{d} y \wedge \boldsymbol{d} z+A_{y}\, \boldsymbol{d} z \wedge \boldsymbol{d} x+A_{z}\, \boldsymbol{d} x \wedge \boldsymbol{d} y \tag{38.55}\\
& =\int_{V}\left(\frac{\partial A_{x}}{\partial x}+\frac{\partial A_{y}}{\partial y}+\frac{\partial A_{z}}{\partial z}\right) \boldsymbol{d} x \wedge \boldsymbol{d} y \wedge \boldsymbol{d} z
\end{align*}
In ordinary Cartesian coordinates, this is equivalent to
\begin{equation*}
\int_{\Sigma} A_{x}\, \mathrm{d} y\, \mathrm{d} z+A_{y}\, \mathrm{d} z\, \mathrm{d} x+A_{z}\, \mathrm{d} x\, \mathrm{d} y=\int_{V} \vec{\nabla} \cdot \vec{A}\, \mathrm{d} x\, \mathrm{d} y\, \mathrm{d} z \tag{38.56}
\end{equation*}
or, in terms of the element $\mathrm{d}\vec{\Sigma}$ of the surface $\Sigma$ and element $\mathrm{d}V$ of the volume $V$,
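The divergence theorem of eqn 38.56 can be verified symbolically for a simple field on the unit cube. The field $\vec{A} = (x^2, y^2, z^2)$ below is an arbitrary illustrative choice, not one from the text:

```python
import sympy as sp

# Illustrative check of the divergence theorem (eqn 38.56) on the unit cube
# for the sample field A = (x**2, y**2, z**2).
x, y, z = sp.symbols('x y z')
A = (x**2, y**2, z**2)

# Right-hand side: volume integral of div A over the cube
div_A = sp.diff(A[0], x) + sp.diff(A[1], y) + sp.diff(A[2], z)
rhs = sp.integrate(div_A, (x, 0, 1), (y, 0, 1), (z, 0, 1))

# Left-hand side: outward flux through the six faces of the cube
lhs = (sp.integrate(A[0].subs(x, 1) - A[0].subs(x, 0), (y, 0, 1), (z, 0, 1))
       + sp.integrate(A[1].subs(y, 1) - A[1].subs(y, 0), (x, 0, 1), (z, 0, 1))
       + sp.integrate(A[2].subs(z, 1) - A[2].subs(z, 0), (x, 0, 1), (y, 0, 1)))

print(lhs, rhs)  # both equal 3
```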
Stokes' theorem is an essential part of our geometrical description of the conservation of charge in electromagnetism, and also of the constraints on curvature caused by the geometry of spacetime. These topics will be introduced as we take a closer look at the physics of fields, which forms the final part of the book.
Fig. 38.2 Geometries for the theorems of integral calculus.
Chapter summary
All integrals can be expressed as an integral over a form. The integral expresses the inner product of the $n$-form and the $n$-vector that represents the surface over which the integral is carried out.
The general surfaces over which we integrate are chains. The boundaries of chains are found with the boundary operator.
$\partial \partial=0$ expresses the fact that a boundary of a boundary is zero.
Stokes' theorem says that
\begin{equation*}
\int_{\partial C} \tilde{\boldsymbol{\Omega}}=\int_{C} d \tilde{\boldsymbol{\Omega}} \tag{38.58}
\end{equation*}
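Stokes' theorem in the form of eqn 38.58 can be exercised in its simplest nontrivial case: a 1-form $\tilde{\boldsymbol{\Omega}} = P\,\boldsymbol{d}x + Q\,\boldsymbol{d}y$ on the unit square, where it reduces to Green's theorem. The choices $P = -y$, $Q = x$ below are illustrative:

```python
import sympy as sp

# Illustrative check of Stokes' theorem for the 1-form P dx + Q dy on the
# unit square, traversed counterclockwise. Here d(Omega) has the single
# component dQ/dx - dP/dy on dx ^ dy.
x, y = sp.symbols('x y')
P, Q = -y, x  # sample coefficients, chosen for illustration

# Interior: integral of the exterior derivative over the square
interior = sp.integrate(sp.diff(Q, x) - sp.diff(P, y), (x, 0, 1), (y, 0, 1))

# Boundary: line integral over the four edges
boundary = (sp.integrate(P.subs(y, 0), (x, 0, 1))      # bottom edge
            + sp.integrate(Q.subs(x, 1), (y, 0, 1))    # right edge
            + sp.integrate(P.subs(y, 1), (x, 1, 0))    # top edge, reversed
            + sp.integrate(Q.subs(x, 0), (y, 1, 0)))   # left edge, reversed

print(interior, boundary)  # both 2
```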
Exercises
(38.1) Consider the 1-form $\tilde{\boldsymbol{A}} = A_i\, \boldsymbol{d}x^i$ in flat, three-dimensional space. Use Stokes' theorem to prove the Kelvin-Stokes theorem
(b) Show that the 2-form $\tilde{\boldsymbol{G}} = x^1\, \boldsymbol{d}x^2 \wedge \boldsymbol{d}x^3$, defined in three-dimensional space, is not exact on $S^2$.
(38.3) Consider the vortex 1-form field
where $r^2 = x^2 + y^2$ and $A$ is a constant. This form is not exact, since $\tilde{\phi} \neq \boldsymbol{d}f$ for any function $f$. Show, however, that the vortex form is closed, with $\boldsymbol{d}\tilde{\phi} = 0$.
This is an example of a closed form that is not exact. Although we appear to be working in Euclidean 2-space (which is simply connected) where we expect closed forms to be exact, the field is actually singular at the origin. As a result, we must remove this troublesome point. We are not therefore really dealing with a simply connected space and so Poincaré's lemma does not apply.
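Both properties can be checked symbolically. The displayed equation for the vortex form is not reproduced above, so the sketch below assumes the standard choice $\tilde{\phi} = A(x\,\boldsymbol{d}y - y\,\boldsymbol{d}x)/r^2$:

```python
import sympy as sp

# Assumed vortex form: phi = A*(x dy - y dx)/r**2 with r**2 = x**2 + y**2.
# (An assumption for illustration; the text's displayed equation is elided.)
x, y, t, A = sp.symbols('x y t A')
r2 = x**2 + y**2
phi_x = -A * y / r2   # coefficient of dx
phi_y = A * x / r2    # coefficient of dy

# Closed: the single component of d(phi) vanishes away from the origin
d_phi = sp.simplify(sp.diff(phi_y, x) - sp.diff(phi_x, y))
print(d_phi)  # 0

# Not exact: the integral around the unit circle is 2*pi*A, not zero
cx, cy = sp.cos(t), sp.sin(t)
integrand = sp.simplify((phi_x * sp.diff(cx, t)
                         + phi_y * sp.diff(cy, t)).subs({x: cx, y: cy}))
loop = sp.integrate(integrand, (t, 0, 2 * sp.pi))
print(loop)  # 2*pi*A
```

The nonzero loop integral is exactly the obstruction discussed above: the singular point at the origin spoils the simple connectedness needed for Poincaré's lemma.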
(38.4) (a) By computing connection coefficients in the orthonormal frame, show that the geodesic equation on the surface of a unit sphere can be written as
where $(x^1, x^2) = (\theta, \phi)$ and dots denote derivatives with respect to the proper time $\tau$.
This equation can be solved by a velocity of the form $\boldsymbol{u}(t) = \mathrm{e}^{\boldsymbol{A}(t)}\boldsymbol{u}(0)$ where, for our chosen basis,
where A\mathcal{A} is the area enclosed by the loop.
(d) Hence, show that if the vector $\boldsymbol{u}$ is parallel transported around a closed loop, the result can be described by the operation $\underline{\boldsymbol{R}}\boldsymbol{u}$, where $\underline{\boldsymbol{R}}$ is a matrix corresponding to rotation by an angle $\mathcal{A}$. See Blennow and Ohlsson, whose approach we have followed here, for a more complete discussion of this problem. The result of this problem is a useful one in magnetism for computing geometric phases.
Part VI
Classical and quantum fields
Having sharpened our geometrical tools in Part V, we now apply these ideas to a geometrical description of the physics of general relativity. Our goal is to understand the physics of relativity from both the point of view of differential geometry and also from that of classical field theory. 'Classical' here means pre-quantum, but including relativity. The extension to quantum physics is our focus towards the end of this part of the book.
In Chapter 39, we examine the physics of fluids, including their equations of motion and thermodynamics.
In Chapter 40, we approach the Einstein equations from the point of view of classical field theory where using the idea of a Lagrangian density we can see how general relativity fits into the framework of field theory. We apply this to the problem of the early Universe in Chapter 41.
In Chapter 42, we formulate a geometrical description of electromagnetism. This is developed in Chapter 43, where we use differential geometry to understand the conservation of charge.
In Chapter 44, we examine the link between relativity and the very influential idea of a gauge in a field theory.
In Chapter 45, we examine gravitation in the domain where it is usually encountered: the limit of small field. This enables us to discuss gravitational radiation in Chapter 46.
In Chapters 47-49, we investigate how quantum mechanics might be linked with gravitation, and how having extra dimensions available might enable this.
Finally, in Chapter 50, we show the inevitability of a Big Bang from our geometrical viewpoint and its reliance on smooth spacetimes.
Fluids as dry water
39.1 Euler's equation 413
39.2 Energy and Bernoulli's equation
39.3 Energy-momentum tensor
39.4 Relativistic fluids 420
Chapter summary 425
Exercises 425
The material in this chapter fills in some background on how fluids are treated in physics. It can be skipped on a first reading. ${}^{1}$ This approach is adopted in Vol. II, Lecture 40 of the Feynman Lectures on Physics, R.P. Feynman, R. Leighton and M. Sands. In the lecture after this, Feynman examines some of the consequences of including viscosity, and the interested reader is advised to look there in the first instance. ${}^{2}$ We shall distinguish the mass density $\rho_0$ of the fluid and the energy density $\rho c^2$, which includes a contribution from $\rho_0 c^2$ and also sources of internal energy
Fig. 39.1 Flow lines form a congruence of curves. The tangents give the velocity field.
He who drinks a tumbler of water in London has literally in his stomach more animated beings than there are men, women and children on the face of the globe.
Sydney Smith (1771-1845)
A particularly satisfying classical field theory is that of the fluid. In this chapter, we start by examining a non-relativistic fluid in flat spacetime, and formulate mass conservation and an equation of motion, and then look at the consequences of energy conservation. We then turn to the problem of the relativistic fluid. At the risk of being perverse, we shall ignore exactly that feature that gives fluids their characteristic wetness: viscosity. This omission is made because the fluids we describe in the context of relativity are well described as inviscid (i.e. viscosity-free). This approach${}^{1}$ also has the attractive property of making the subject as straightforward as possible.
A fluid is matter that continuously deforms if it is subject to a shear stress. In modelling a fluid, we identify elements of the fluid, initially at some coordinate $x^\mu = (t, \vec{x})$, that we expect to move around. This movement is called a flow. To describe a fluid, we specify its mass density,${}^{2}$ described by a scalar field $\rho_0(t, \vec{x})$, and its velocity, which is described by a vector field $\vec{v}(t, \vec{x})$.
The velocity is best thought of as being a tangent field. The tangents in question are tangents to a congruence of curves. As usual, the congruence fills all of the space, so that there is a single tangent at a particular point. This tangent is identical to the velocity vector field evaluated at that point (Fig. 39.1). The congruent curves are known as the streamlines of the fluid. (They are also known as the flow lines, and they do indeed describe the flow of the fluid.) Defined in this way, the flow lines tell us about the velocities of the elements of the fluid, rather than their trajectories. Flow lines are only identical to the trajectories of the fluid elements in the case of steady flow of the fluid.
We want to ensure that the mass of the fluid is locally conserved, which is to say that (in the absence of sources and sinks of mass) the change of fluid mass in a particular volume is accounted for by the flow of mass into or out of that volume. We define the fluid's mass current vector $\boldsymbol{J}$, which has components $J^\mu = (\rho_0, \rho_0\vec{v})$. Conservation of mass is then written as $\boldsymbol{\nabla}\cdot\boldsymbol{J} = 0$, or in components
A fluid with $\vec{\omega} = 0$ is called irrotational.
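The component form of mass conservation, $\partial\rho_0/\partial t + \vec{\nabla}\cdot(\rho_0\vec{v}) = 0$, can be checked symbolically on a simple exact solution. The 1-d profile advected at constant speed $c$ below is our illustrative choice, not an example from the text:

```python
import sympy as sp

# Illustrative check of the continuity equation
#   d(rho)/dt + d(rho*v)/dx = 0
# for a 1-d density profile carried along at constant speed c.
t, x, c = sp.symbols('t x c')
f = sp.Function('f')

rho = f(x - c * t)   # any profile rigidly advected at speed c
v = c                # uniform velocity field

continuity = sp.diff(rho, t) + sp.diff(rho * v, x)
print(sp.simplify(continuity))  # 0: mass is locally conserved
```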
39.1 Euler's equation
We would like to identify an equation of motion for the fluid. We shall do this non-relativistically by using Newton's second law. If a volume of fluid has a pressure p(t, vec(x))p(t, \vec{x}) then we have access to the local force on an element of fluid
\begin{equation*}
\vec{F}=-\oint \mathrm{d} \vec{S}\, p(t, \vec{x})=-\int \mathrm{d}^{3} x\, \vec{\nabla} p \tag{39.3}
\end{equation*}
where $\mathrm{d}\vec{S}$ is a directed element of area of the fluid element, and $\mathrm{d}^3x$ is an element of volume (Fig. 39.2). The equation effectively defines the pressure of the fluid as providing an inward-directed force (hence the minus sign) on an element.${}^{3}$ The product of mass and acceleration of the fluid is
Equating these two expressions ^(4){ }^{4} gives an equation of motion
\begin{equation*}
\rho_{0} \frac{\mathrm{d} \vec{v}}{\mathrm{d} t}=-\vec{\nabla} p \tag{39.6}
\end{equation*}
At this point, we need to pause and consider the details of the frames of reference we are using. The velocity field is a function of time and position. The total time derivative $\mathrm{d}/\mathrm{d}t$ we have assumed does not fix the space coordinates as a partial derivative does.${}^{5}$ The total derivative therefore corresponds to the rate of change of the velocity as observed by an observer who travels along with the fluid, so that their spatial coordinate is carried along with the element of fluid under consideration. This is the same situation we had in Chapter 15, where we described the observer as comoving. We conclude that the comoving observer probes the fluid using the total derivative $\mathrm{d}/\mathrm{d}t$.
Example 39.1
The total derivative describes the rate of change measured by an observer moving locally with the fluid. However, we typically have access to the partial derivative: the rate of change of the fluid from our fixed frame of reference, relative to which the fluid flows. Applying this to the $i$th component of a vector field $\vec{A}(t, \vec{x})$, we find
Fig. 39.2 An outwardly directed element of fluid surface at pressure $p$ used to define the force. ${}^{3}$ The surface integral in eqn 39.3 is taken over the surface of the element of fluid, on which the pressure of the rest of the fluid acts. From the volume integral we see immediately that if there is no gradient in the pressure field, there is no net force on the element, which makes sense. ${}^{4}$ That is, we find that
and so the integrand must vanish. ${}^{5}$ The partial derivative with respect to time should be interpreted as reading
$\left(\frac{\partial}{\partial t}\right)_{\substack{\text{fixed} \\ \text{position}}}$
${}^{6}$ We summarize two useful rules for non-relativistic fluid dynamics:
(i) The derivative rule: we use $\mathrm{d}/\mathrm{d}t$ in the comoving frame and $\partial/\partial t + \vec{v}\cdot\vec{\nabla}$ in the stationary frame.
(ii) The divergence rule: the velocity's divergence (a quantity evaluated in the stationary frame) is given in the comoving frame in terms of the rate of change of the volume of a fluid element, viz.
where $v^j = \partial x^j/\partial t$ is a component of the fluid's velocity. This expression provides a useful link between the total derivative (appropriate for the comoving observer) and the partial derivative (appropriate for the stationary observer).
Using the previous example, we have the useful equation in vector notation, that
Using this operator, we can convert our equations between comoving and stationary frames. Notice how the right-hand side of this equation has two contributions: (i) the rate of change of the vector field at a fixed spatial coordinate, added to (ii) the directional derivative (i.e. the derivative directed along the velocity vector $\vec{v}$). Another way of describing the directional derivative is that $\vec{v}\cdot\vec{\nabla}$ represents a spatial derivative taken along the flow lines.
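The convective derivative $\partial/\partial t + \vec{v}\cdot\vec{\nabla}$ can be exercised on a toy scalar field; the field and flow below are our illustrative choices:

```python
import sympy as sp

# Illustrative application of the convective derivative
#   D/Dt = d/dt + v . grad
# to a sample 1-d scalar field T(t, x) in a uniform flow v = (v0, 0, 0).
t, x, v0 = sp.symbols('t x v0')
T = x**2 + t  # sample field, chosen for illustration

DTDt = sp.diff(T, t) + v0 * sp.diff(T, x)
print(DTDt)  # 2*v0*x + 1: local change plus change along the flow
```

The two terms in the result are exactly the two contributions named above: the rate of change at fixed position (here $1$) and the directional derivative along $\vec{v}$ (here $2v_0 x$).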
Example 39.2
Our mass conservation equation $\boldsymbol{\nabla}\cdot\boldsymbol{J} = 0$ was previously written in terms of partial derivatives and so corresponds to the viewpoint of a stationary observer. We can use the convective derivative equation to shift this into the comoving frame. First expand the conservation equation, and then use the convective derivative to yield
In the comoving frame, the observer who follows the fluid is able to identify a fixed mass of fluid MM, which should not vary as a function of time. The mass of fluid instantaneously fills some volume VV and has a density rho_(0)\rho_{0} with M=rho_(0)VM=\rho_{0} V. We have, therefore, that
Combining the previous two expressions allows us to conclude that the divergence of the velocity is given in terms of the fluid's volume by${}^{6}$
Expanding the equation of motion (eqn 39.6) in terms of the convective derivative results in Euler's equation for the motion of the fluid for the stationary observer, which reads
The existence of a hydrostatic fluid allows us to derive${}^{7}$ Pascal's law. Expanding the last expression in Cartesian coordinates and assuming gravity acts in the $-z$ direction, we have
which is Pascal's law, giving the variation of pressure with depth in a hydrostatic fluid.
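Integrating Pascal's law for a fluid of uniform density gives $p = p_0 + \rho_0 g \times \text{depth}$. A minimal numerical sketch (values for water, chosen for illustration):

```python
# Illustrative use of Pascal's law: hydrostatic pressure p = p0 + rho0*g*d
# at depth d below the surface. Values are for water at sea level.
rho0 = 1000.0    # kg/m^3, density of water
g = 9.81         # m/s^2, gravitational acceleration
p0 = 101325.0    # Pa, atmospheric pressure at the surface

def pressure(depth_m):
    """Hydrostatic pressure at a given depth below the surface."""
    return p0 + rho0 * g * depth_m

print(pressure(10.0))  # 199425.0 Pa: roughly double atmospheric at 10 m
```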
The gravitational field can be written in terms of a gravitational potential vec(g)=- vec(grad)Phi\vec{g}=-\vec{\nabla} \Phi. The gravitational potential fits more comfortably into a description of fluids in terms of their energy.
Example 39.4
Consider the cross product of the velocity and the vorticity ($\vec{\omega} = \vec{\nabla}\times\vec{v}$)
where we have used the rule for a double cross product in the second line. This expression can be used to remove the directional derivative term from Euler's equation (eqn 39.12). We also take the opportunity to replace $\vec{g}$ with $-\vec{\nabla}\Phi$, with the result
This is a restatement of the Euler equation, but is also the most general form of what is known as Bernoulli's equation. ^(8){ }^{8} We shall see how this equation arises from energetic considerations in the next section.
The Euler equation gives us an equation of motion for the fluid. However, there is much additional insight to be gained into the motion if we consider the energetics of the fluid. These considerations are our next topic.
39.2 Energy and Bernoulli's equation
The thermodynamics of the fluid at an equilibrium temperature TT may be investigated using the first law of thermodynamics
\begin{equation*}
\mathrm{d} U=T \mathrm{d} S-p \mathrm{d} V \tag{39.19}
\end{equation*}
^(7){ }^{7} Blaise Pascal (1623-1662), mathematician, physicist, philosopher, theologian and inventor, was described by Roberto Rossellini as 'a boring man who never made love in his life'. Rossellini made a well-regarded film about Pascal's life for French television in 1972. ^(8){ }^{8} Daniel Bernoulli (1700-1782) was the son of Johann Bernoulli (famed for his developments in calculus) and the nephew of Jacob Bernoulli (who founded the study of probability). Daniel's work on fluids was published as Hydrodynamica. Parts of his work were plagiarized by his father and published in Johann's book Hydraulica. Such behaviour is not uncommon in the story of the Bernoulli family. ^(9){ }^{9} Writing U(S,V)U(S, V) implies that entropy and volume are the natural variables of UU. These are the quantities that appear as differentials in the first law.
where U(S,V)U(S, V) is the internal energy and SS is the entropy. ^(9){ }^{9} Of course,
the first law simply expresses energy conservation. We want to work with the first law in units where the internal energy and the entropy are given per unit mass (these quantities are denoted by lower-case letters). Unit mass implies that volume VV can be replaced with 1//rho_(0)1 / \rho_{0} and we have a unit-mass version of the first law, written as
The natural variables of uu are ss and rho_(0)\rho_{0}. In situations where the natural variables of a problem are ss and pp, it is often useful to describe the physics using the enthalpy H(S,p)=U+pVH(S, p)=U+p V and so we have, in unit mass form,
where hh is enthalpy per unit mass. This implies we can write the enthalpy as h=u+p//rho_(0)h=u+p / \rho_{0}.
In a perfect fluid, there is no dissipation, and so we must have no entropy production, i.e. ds=0\mathrm{d} s=0. In this case,
From the former expression for the first law du=-pd(1//rho_(0))\mathrm{d} u=-p \mathrm{~d}\left(1 / \rho_{0}\right), we find an expression for the pressure of a perfect fluid, which is
Evaluated in the static frame, we note that del s//del t=0\partial s / \partial t=0 for the perfect fluid. However, the entropy is constant when following the flow (in the comoving frame with d//dt\mathrm{d} / \mathrm{d} t ) as well as in the static frame (where we use the convective derivative del//del t+ vec(v)* vec(grad)\partial / \partial t+\vec{v} \cdot \vec{\nabla} ), with the consequence that we can write
This latter equation is useful in simplifying thermodynamic descriptions of a perfect fluid.
For the special case of steady flow, we have the defining property $\partial(\text{anything})/\partial t = 0$. As a result, we can use a simplifying expression
This simply says that when the flow is steady, the rate of change of quantities with time for the comoving observer is given by the spatial derivative along the flow lines.
For the perfect fluid in a state of steady flow, eqn (39.25) implies two thermodynamic equations for the enthalpy, one for the comoving frame and one for the stationary frame. These are, respectively,
Spot that $\vec{v}\cdot(\vec{v}\times\vec{\omega}) = 0$ and then consider the case of steady flow by setting $\partial/\partial t = 0$ to obtain
where we have used the second of eqn 39.26 to substitute the pressure with the enthalpy for the case of a perfect fluid. This resulting expression requires a little unpacking at this stage.
From the last example, we have a result given in terms of the operator $\vec{v}\cdot\vec{\nabla}$ appropriate for the stationary observer. Since this is steady flow, we can switch to the comoving frame using the rule that $\vec{v}\cdot\vec{\nabla} \rightarrow \frac{\mathrm{d}}{\mathrm{d}t}$ as we move along the flow lines. This gives an alternative form for Bernoulli's equation for steady flow of a perfect fluid in the comoving frame as
Notice how the terms in the bracket represent kinetic energy, gravitational energy and enthalpy. We conclude that for steady flow, Bernoulli's equation, which we introduced as a restatement of Euler's equation, simply expresses conservation of energy. It applies for the comoving observer, which is to say that conservation of energy in the form
applies along the flow lines. Since h=u+p//rho_(0)h=u+p / \rho_{0}, we can rewrite the conservation of energy equation with pressure included explicitly. That is, along the flow lines we have that Bernoulli's equation reads
To complete our look at the energetics of the fluid we consider an irrotational flow, where vec(grad)xx vec(v)=0\vec{\nabla} \times \vec{v}=0. In this case we can introduce a velocity potential phi\phi via
which has the property vec(grad)xx vec(grad)phi=0\vec{\nabla} \times \vec{\nabla} \phi=0. Let's see how this works in Bernoulli's equation.
Example 39.7
From the first law for enthalpy we have dh=Tds+dp//rho_(0)\mathrm{d} h=T \mathrm{~d} s+\mathrm{d} p / \rho_{0}. Taking derivatives with respect to the spatial coordinates x^(i)x^{i} we have
If the flow is isentropic the middle term falls out and we have vec(grad)h= vec(grad)p//rho_(0)\vec{\nabla} h=\vec{\nabla} p / \rho_{0}. As a result, we now write an expression for unsteady, isentropic, irrotational flow of a perfect fluid. Once again, start with Bernoulli's equation
and then use vec(v)= vec(grad)phi, vec(omega)=0\vec{v}=\vec{\nabla} \phi, \vec{\omega}=0 and vec(grad)h= vec(grad)p//rho_(0)\vec{\nabla} h=\vec{\nabla} p / \rho_{0} to say
To guarantee that the right-hand side of this last equation vanishes, the part in the bracket must be constant in space.
The result of the last example is Bernoulli's equation for unsteady flow of an irrotational, perfect fluid. Notice how the operator on the bracket was vec(grad)\vec{\nabla} and not vec(v)* vec(grad)\vec{v} \cdot \vec{\nabla}. This implies that the equation does not just apply along the flow lines (identified with vec(v)*grad\vec{v} \cdot \nabla ) but everywhere. That is to say, throughout the fluid, we have
This differs from the steady flow version through the time derivative of the velocity potential and in its realm of applicability.
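For steady flow of an incompressible fluid (so that $h \to p/\rho_0$), the conserved quantity along a flow line is $\frac{1}{2}v^2 + gz + p/\rho_0$. A minimal numerical sketch, with illustrative values:

```python
# Illustrative use of Bernoulli's equation along a flow line for steady,
# incompressible flow: (1/2)v**2 + g*z + p/rho0 is constant.
rho0 = 1000.0  # kg/m^3, water
g = 9.81       # m/s^2

def downstream_pressure(p1, v1, z1, v2, z2):
    """Solve (1/2)v1^2 + g*z1 + p1/rho0 = (1/2)v2^2 + g*z2 + p2/rho0 for p2."""
    return p1 + rho0 * (0.5 * (v1**2 - v2**2) + g * (z1 - z2))

# Water speeds up from 1 m/s to 3 m/s through a level constriction:
p2 = downstream_pressure(2.0e5, 1.0, 0.0, 3.0, 0.0)
print(p2)  # 196000.0 Pa: pressure drops where the flow is faster
```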
39.3 Energy-momentum tensor
We have already seen the importance of the energy-momentum tensor. Here we consider its form in the non-relativistic fluid. The main message here is that we can solve many matter-field problems using the key equation
where 4-vectors are used for the velocity $\boldsymbol{v}$, and $\boldsymbol{g}$ is the metric. Here $\rho$ is the energy density of the fluid (as measured in its local rest frame). In general, it will receive contributions from the mass density $\rho_0$ and also the internal energy density $\rho_0 u$, where $u$ is the specific internal energy of the fluid.${}^{11}$ In the case of flat spacetime, we have $\boldsymbol{g} = \boldsymbol{\eta}$ and so, if the fluid has a non-relativistic velocity $\boldsymbol{v}$, we have $\rho \gg p/c^2$ and
This is a statement of conservation of energy. From the first law we see the left-hand side gives del p//del t=rho_(0)del h//del t\partial p / \partial t=\rho_{0} \partial h / \partial t, telling us about sources and sinks of energy. The righthand side expresses the local flow of energy density grad*J\boldsymbol{\nabla} \cdot \boldsymbol{J}. In the absence of energy sources, we have grad*J=0\boldsymbol{\nabla} \cdot \boldsymbol{J}=0.
Considering next the spatial components of eqn 39.43 (by setting nu=i\nu=i ), we find
If we assume that the internal energy density contribution to rho\rho is small in the nonrelativistic limit, then the first three terms express local conservation of mass-density. Setting them to zero (i.e. assuming conservation of mass), the final three terms yield
We see how the conservation of energy-momentum approach yields up both (i) the conservation of mass-energy and (ii) the equations of motion. We shall find the same idea applies for relativistic fluids, allowing us to extract a wealth of information from grad*T=0\boldsymbol{\nabla} \cdot \boldsymbol{T}=0. ^(11){ }^{11} Restoring factors of cc, we have an energy density rhoc^(2)=rho_(0)c^(2)+rho_(0)u\rho c^{2}=\rho_{0} c^{2}+\rho_{0} u. In natural units, we write rho=rho_(0)(1+u)\rho=\rho_{0}(1+u). ^(12){ }^{12} The untidy justification presented here can be made more respectable by projecting out parts of the divergence using the velocity vector. That method is presented in the next section.
39.4 Relativistic fluids
We shall apply the same methodology that we applied to the nonrelativistic fluid to its relativistic counterpart. In particular, we require relativistic upgrades of the appropriate derivatives, expressions for energy and for the energy-momentum tensor.
Mathematically, relativistic fluids are described by an energy density and a set of flow lines. The flow lines are the integral curves of the velocity field ^(13)u{ }^{13} \boldsymbol{u}. The (3+1)-dimensional derivative along the flow lines is given by the covariant derivative u*grad=grad_(u)\boldsymbol{u} \cdot \boldsymbol{\nabla}=\boldsymbol{\nabla}_{\boldsymbol{u}}. In general, the flow lines are not geodesics described by grad_(u)u=0\boldsymbol{\nabla}_{\boldsymbol{u}} \boldsymbol{u}=0. We shall see that the equation of motion (a relativistic Euler equation) tells us directly how much the flow lines depart from being geodesics. The departure is caused by gradients in pressure.
The relativistic fluid in which we're interested is unlikely to be very fast-moving water, or similar. Rather, we are interested in the cosmological fluid filling spacetime, or the energetic matter inside stars. This is made up principally of massive particles we'll call${}^{14}$ baryons. We also need to be slightly more careful about our definitions of thermodynamic variables. Let's define a set of objects $n, \rho, p, T, s$. These are scalar fields and so vary as a function of position in spacetime. In the rest frame of an element of fluid, they correspond to the results of measurements of the following physical quantities:${}^{15}$
The number density of baryons is nn. If the average rest mass per baryon is bar(m)_(b)\bar{m}_{\mathrm{b}}, then the rest-mass density of baryons is rho_(0)= bar(m)_(b)n\rho_{0}=\bar{m}_{\mathrm{b}} n.
The total energy density is rho\rho. This includes the mass-energy of baryons rho_(0)\rho_{0} and internal energy. If the specific internal energy is uu then we have a total energy density ^(16){ }^{16}
In non-relativistic fluid dynamics, we had two useful rules, (i) the derivative rule: that we use $\mathrm{d}/\mathrm{d}t$ in the comoving frame and $\partial/\partial t + \vec{v}\cdot\vec{\nabla}$ otherwise, and (ii) the divergence rule $\vec{\nabla}\cdot\vec{v} = (1/V)(\mathrm{d}V/\mathrm{d}t)$. These have memorable relativistic counterparts: (i) a comoving observer measures rates of change using their proper time${}^{17}$ $\mathrm{D}/\mathrm{d}\tau$, while the natural upgrade for the convective derivative is the covariant derivative $\boldsymbol{\nabla}_{\boldsymbol{u}}$; (ii) the divergence rule becomes $\boldsymbol{\nabla}\cdot\boldsymbol{u} = (1/V)\,\mathrm{d}V/\mathrm{d}\tau$.
Example 39.9
Let's examine the rules in more detail. For rule (i), we consider a particle that moves along a world line parametrized by proper time tau\tau which has a tangent vector u\boldsymbol{u}. In the comoving frame in flat space, we recall that the rate of change of a vector quantity like A\boldsymbol{A} is
which, recalling that u\boldsymbol{u} has components (gamma(u),gamma(u) vec(u))(\gamma(u), \gamma(u) \vec{u}), illustrates how the covariant derivative, with its directional property in spacetime, supplies the natural relativistic analogue of our previous expression for the convective derivative.
For rule (ii), we seek an upgrade of the rule converting between the divergence of velocity and fluid volume whose form, we recall, in non-relativistic Euclidean space was $\frac{1}{V}\frac{\mathrm{d}V}{\mathrm{d}t} = \vec{\nabla}\cdot\vec{v}$. To justify our proposed upgrade, we note that non-relativistically $u^t = \mathrm{d}t/\mathrm{d}\tau = 1$ and so, motivated by rule (i), we can suggest the compatible expression${}^{18}$
where u\boldsymbol{u} is the 4 -velocity of the fluid. Since our expressions are valid covariant statements in flat spacetime, they must, by the principle of covariance, be the case in curved spacetime. With this in hand, we can check the conservation equation.
As we've seen previously, the relativistic energy-momentum tensor T\boldsymbol{T} for a perfect fluid is given by
In the fluid's rest frame, we have $T^{00} = \rho$, $T^{j0} = 0$ and $T^{ij} = p\,\delta^{ij}$. The conservation equation takes the form
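The rest-frame components just quoted can be checked directly by assembling the perfect-fluid tensor numerically. The sketch below assumes natural units ($c = 1$) and metric signature $(-,+,+,+)$, a common convention that may differ from the book's; the values of $\rho$ and $p$ are illustrative:

```python
# Illustrative construction of the perfect-fluid energy-momentum tensor
#   T^{mu nu} = (rho + p) u^mu u^nu + p eta^{mu nu}
# in natural units (c = 1), signature (-,+,+,+) (our assumed convention).
rho, p = 2.5, 0.4                       # sample rest-frame values
eta = [[-1.0, 0.0, 0.0, 0.0],
       [0.0, 1.0, 0.0, 0.0],
       [0.0, 0.0, 1.0, 0.0],
       [0.0, 0.0, 0.0, 1.0]]            # flat metric
u = [1.0, 0.0, 0.0, 0.0]                # 4-velocity in the fluid's rest frame

T = [[(rho + p) * u[m] * u[n] + p * eta[m][n] for n in range(4)]
     for m in range(4)]

# In the rest frame: T^00 = rho, T^jj = p, off-diagonal entries vanish.
print([T[m][m] for m in range(4)])  # [2.5, 0.4, 0.4, 0.4]
```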
With the above concepts in place, we can discuss the key ideas that govern relativistic fluid dynamics.
I: Baryon conservation is perhaps the most primitive conservation law we can specify. It is written in the comoving frame as ^(19){ }^{19}
where the semicolon notation has been employed in the component version on the right. ${}^{18}$ Recall that $\vec{\nabla}\cdot\vec{v} = v^i{}_{,i}$ and $\boldsymbol{\nabla}\cdot\boldsymbol{u} = u^\mu{}_{;\mu}$. ${}^{20}$ Also recall that $\boldsymbol{\nabla}_{\boldsymbol{u}} \equiv \boldsymbol{u}\cdot\boldsymbol{\nabla}$. ${}^{21}$ It's also useful to note that we also have
The vanishing of the final expression, whose coordinate version is $(n u^\alpha)_{;\alpha}$, is what we wanted to prove.${}^{21}$
II: Entropy conservation The amount of entropy in a volume $V$ is $S=nsV$. In a perfect fluid, we assume that there is no heat flow between flow lines, which implies
This can be combined with the conservation of baryons, $\mathrm{d}(nV)/\mathrm{d}\tau=0$, to allow us to conclude that $\frac{\mathrm{d}s}{\mathrm{d}\tau}\geq 0$, which is the second law of thermodynamics. For a perfect fluid we have the stronger statement $\frac{\mathrm{d}s}{\mathrm{d}\tau}=0$.
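To see why baryon conservation turns the entropy statement into one about $s$ alone, note that with $S=s\,(nV)$ and $\mathrm{d}(nV)/\mathrm{d}\tau=0$ we have
\begin{equation*}
\frac{\mathrm{d}S}{\mathrm{d}\tau}=nV\frac{\mathrm{d}s}{\mathrm{d}\tau}+s\frac{\mathrm{d}(nV)}{\mathrm{d}\tau}=nV\frac{\mathrm{d}s}{\mathrm{d}\tau},
\end{equation*}
so $\mathrm{d}S/\mathrm{d}\tau\geq 0$ is equivalent to $\mathrm{d}s/\mathrm{d}\tau\geq 0$.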
III: Energy conservation is expressed via $\boldsymbol{\nabla}\cdot\boldsymbol{T}=0$. As in the non-relativistic case, this powerful equation actually outputs more than we need once we substitute the energy-momentum tensor of the relativistic fluid: it yields both energy conservation and the equation of motion. To restrict the result to energy conservation we use a contracted version of this equation
To break this down further, concentrate on the term $u^{\alpha}{}_{;\beta}u^{\beta}u_{\alpha}$. Translating back into vector notation, this is rewritten as
This is useful since we can spot that, in the same way as for vectors in flat space, where $\boldsymbol{v}\cdot\frac{\mathrm{d}\boldsymbol{v}}{\mathrm{d}x}=\frac{1}{2}\frac{\mathrm{d}\boldsymbol{v}^{2}}{\mathrm{d}x}$, we have here that
Written in coordinate-free notation, the conclusion from the last example, for the projection of the equation $\boldsymbol{\nabla}\cdot\boldsymbol{T}=0$ along the velocity direction, is
We shall see very shortly that this expression does indeed describe energy conservation.
Let's link the result of the previous example to energy conservation. In relativity, the first law of thermodynamics may be expressed as$^{23}$
\begin{equation*}
\mathrm{d}\binom{\text { Energy in a volume element }}{\text { with fixed number } N \text { of baryons }}=-p\,\mathrm{d} V+T\,\mathrm{d} S \tag{39.71}
\end{equation*}
where $V=N/n$. We note that the factors of $N$ are common to all terms, so taking the derivatives with respect to $n$ and rearranging, we obtain the first law of thermodynamics for our relativistic variables as
\begin{equation*}
\mathrm{d} \rho=\frac{p+\rho}{n}\,\mathrm{d} n+n T\,\mathrm{d} s \tag{39.73}
\end{equation*}
Setting $\mathrm{d}s=0$ and taking derivatives with respect to the proper time $\tau$ we obtain eqn 39.70.
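Spelling out that last step (a sketch, assuming eqn 39.70 is the energy equation obtained in the example above): with $\mathrm{d}s=0$ the first law gives
\begin{equation*}
\frac{\mathrm{d}\rho}{\mathrm{d}\tau}=\frac{p+\rho}{n}\frac{\mathrm{d}n}{\mathrm{d}\tau},
\end{equation*}
and baryon conservation in the form $\frac{1}{n}\frac{\mathrm{d}n}{\mathrm{d}\tau}=-\boldsymbol{\nabla}\cdot\boldsymbol{u}$ then yields $\nabla_{\boldsymbol{u}}\rho=-(\rho+p)\,\boldsymbol{\nabla}\cdot\boldsymbol{u}$.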
IV: Momentum conservation As in the non-relativistic case, this comes from a projection of $\boldsymbol{\nabla}\cdot\boldsymbol{T}=0$. This time we project out the part of the divergence perpendicular to the velocity $\boldsymbol{u}$ using the $(0,2)$ projection-operator tensor$^{24}$ $\boldsymbol{P}$, which has components $P_{\alpha\mu}=g_{\alpha\mu}+u_{\alpha}u_{\mu}$. $^{23}$ Interesting here is that the differential in this expression can be interpreted as the exterior derivative $\boldsymbol{d}$. No assumption is made in the first law about following the flow of the fluid. $^{24}$ To find the part of a vector $\boldsymbol{v}$ perpendicular to a velocity vector $\boldsymbol{u}$ we write
Comparing the action of $\boldsymbol{P}$ we find $P_{\alpha \mu} v^{\mu}=v_{\alpha}+\left(u_{\mu} v^{\mu}\right) u_{\alpha}, \quad(39.75)$
which is the down-index version of the simple prescription above.$^{25}$ This can be seen by expanding the $(2,0)$ version of $\boldsymbol{P}$ in the product $\boldsymbol{P} \cdot(\boldsymbol{\nabla} p)$ in components. Explicitly,
which is the up-index version of the right-hand side of eqn 39.78, which can also be written as $\boldsymbol{P}(\boldsymbol{\nabla} p,\;)$. You can then quickly check that $\boldsymbol{P}(\boldsymbol{\nabla} p, \tilde{\boldsymbol{u}})=0$, which guarantees that $\boldsymbol{u} \cdot \boldsymbol{\nabla}_{\boldsymbol{u}} \boldsymbol{u}=0$, as usual for an accelerated world line. $^{26}$ The non-relativistic Euler equation is
This expresses momentum conservation and, as we shall see, results in the equation of motion for the fluid: the relativistic analogue of the Euler equation. The equation of motion is given as an equation for the flow lines.
The result is that the relativistic Euler equation is written as
\begin{equation*}
(\rho+p)\, \boldsymbol{\nabla}_{\boldsymbol{u}} \boldsymbol{u}=-\boldsymbol{P} \cdot \boldsymbol{\nabla} p \tag{39.80}
\end{equation*}
Comparing this to the non-relativistic version,$^{26}$ we remember that the non-relativistic limit requires $\rho \gg p/c^{2}$. Our rule (i) for swapping convective for covariant derivatives is seen on the left-hand side.
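The defining properties of the projection tensor $\boldsymbol{P}$ (it annihilates $\boldsymbol{u}$ and squares to itself) can be checked numerically. A minimal illustrative sketch, not from the text, in flat spacetime with an arbitrarily chosen boost speed $v=0.6$:

```python
import numpy as np

eta = np.diag([-1.0, 1.0, 1.0, 1.0])           # Minkowski metric, signature (-,+,+,+)
v = 0.6                                        # arbitrary boost speed (illustrative)
gamma = 1.0 / np.sqrt(1 - v**2)
u_up = np.array([gamma, gamma * v, 0.0, 0.0])  # 4-velocity components u^mu
u_dn = eta @ u_up                              # u_mu = g_{mu nu} u^nu

# Normalization u . u = -1
assert np.isclose(u_dn @ u_up, -1.0)

# Projection tensor P_{alpha mu} = g_{alpha mu} + u_alpha u_mu
P_dn = eta + np.outer(u_dn, u_dn)

# P annihilates the velocity: P_{alpha mu} u^mu = 0
assert np.allclose(P_dn @ u_up, 0.0)

# As a (1,1) tensor P is idempotent: P^a_b P^b_c = P^a_c
P_mixed = np.linalg.inv(eta) @ P_dn
assert np.allclose(P_mixed @ P_mixed, P_mixed)
```

Both checks pass for any timelike $\boldsymbol{u}$, which is exactly what makes $\boldsymbol{P}$ a projector onto the subspace perpendicular to the flow.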
Important here is a comparison with the geodesic equation $\nabla_{\boldsymbol{u}} \boldsymbol{u}=0$. We see from eqn 39.80 how, in the relativistic fluid, it is pressure gradients $\nabla p$ perpendicular to the velocity that cause the flow lines to deviate from geodesics. A fluid with uniform pressure simply describes particles that all follow geodesics. A special case of this is the pressureless dust that we often use to describe the cosmological fluid. So, once more with emphasis: dust is a pressureless fluid where all particles follow geodesics.
V: Equation of state The thermodynamics of a fluid is described by an equation of state. The nature of the equation of state depends on the circumstances and the resulting natural variables for the problem at hand.
Example 39.15
An example equation of state for the relativistic fluid might be written as
that is, we specify the total energy density as a function of baryon density and specific entropy. With this assignment, we can use the first law (eqn 39.73) to give us an expression for the pressure in terms of the energy. Taking the derivative with respect to $n$ at constant $s$ leads to
where the natural variables are $n$ and $T$. Again, we can derive an expression for the pressure. The pressure (and remember that it is gradients in this quantity that cause flow lines to depart from geodesics) is generally a first derivative of the free energy. In this case
This completes our review of fluids. One final comment to make is that for the comoving observer, the natural derivative is the Lie derivative described in Chapter 33. This is examined in the exercises.
Chapter summary
Fluid mechanics relies on the velocity field of the fluid or, equivalently, its integral curves.
Euler's equation expresses the equation of motion of the fluid; Bernoulli's equation describes the conservation of energy. Both are contained in the expression $\boldsymbol{\nabla} \cdot \boldsymbol{T}=0$.
In relativistic fluid mechanics, Euler's equation expresses the deviation from geodesic behaviour caused by gradients in the pressure perpendicular to the local velocity.
Exercises
(39.1) Use Bernoulli's theorem to work out how fast fluid flows from a pipe at the bottom of a reservoir. Give your answer in terms of the heights of the top of the fluid $y_{2}$ and the outlet pipe $y_{1}$.
(39.2) In special relativity, we write the components of the velocity vector as $u^{\mu}=\left(u^{0}, \vec{u}\right)=(\gamma, \gamma \vec{v})$, where $\gamma=\left(1-|\vec{v}|^{2}\right)^{-\frac{1}{2}}$. The flat-space, relativistic version of Bernoulli's equation for steady flow
is given by
where $B=\gamma(\rho+p) / \rho_{0}$. We shall prove this.
(a) Start by showing that $T^{0 \mu}{ }_{, \mu}=0$ leads to
(b) Next, use the conservation of rest mass equation, $\boldsymbol{\nabla} \cdot \boldsymbol{u}=-\frac{1}{\rho_{0}} \frac{\mathrm{d} \rho_{0}}{\mathrm{d} \tau}$, to prove eqn 39.85.
(39.3) The speed of sound in a static, perfect fluid. Consider a density wave in flat spacetime, defined by
Hint: The velocity has components $u^{\mu}=(1,0)$ and $(\delta u)^{\mu}=(\gamma, \gamma\, \delta \vec{v})$. However, since $\delta \vec{v}$ is small, we have $\gamma \approx 1$ and so $(\delta u)^{\mu}=(1, \delta \vec{v})$.
(39.4) Define the mass current of point particles as
Use the useful identity $\nabla_{\mu} A^{\mu}=\frac{1}{\sqrt{-g}} \frac{\partial}{\partial x^{\mu}}\left(\sqrt{-g}\, A^{\mu}\right)$ (proved in Exercise 34.3) to show that the mass current is divergenceless.
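The identity is easy to exercise symbolically. A minimal sketch, assuming flat 3-space in spherical polars (where the Euclidean analogue of the identity uses $\sqrt{g}=r^{2}\sin\theta$) and a hypothetical, purely radial field $A^{r}=f(r)$:

```python
import sympy as sp

r, theta = sp.symbols('r theta', positive=True)
f = sp.Function('f')(r)           # hypothetical radial component A^r = f(r)

sqrt_g = r**2 * sp.sin(theta)     # sqrt(det g) for flat space in spherical polars

# (1/sqrt g) d_r (sqrt g A^r): the only nonzero term for a purely radial field
div = sp.diff(sqrt_g * f, r) / sqrt_g

# Mathematically this is f'(r) + 2 f(r)/r, the familiar radial divergence
print(sp.simplify(div))
```

The output reproduces the textbook radial-divergence formula, confirming that the $\frac{1}{\sqrt{g}}\partial_{\mu}(\sqrt{g}\,A^{\mu})$ recipe encodes the coordinate-dependent terms automatically.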
Hint: Use the distributional identity
(39.5) Using the connection coefficients for the Schwarzschild geometry given in Chapter 21, show that $\boldsymbol{\nabla} \cdot \boldsymbol{T}=0$ implies that, for a perfect fluid,
Hint: Remember to represent the components of $\boldsymbol{T}$ in spherical polar coordinates. These can be transformed from the orthonormal-frame version using the appropriate vielbein.
(39.6) In a non-relativistic fluid, conservation of mass is expressed as
The comoving observer is carried along by the fluid's 3-velocity field $\vec{v}$ and measures the change in the mass field $\rho_{0} V$. We shall upgrade the 3-volume $V$ into the language of geometry. We first define a volume element 3-form $\tilde{\boldsymbol{\omega}}=\boldsymbol{d} x \wedge \boldsymbol{d} y \wedge \boldsymbol{d} z$.
(a) Let's find the Lie derivative $£_{\vec{v}}\left(\rho_{0} \tilde{\boldsymbol{\omega}}\right)$. Show, using components, that
where v\boldsymbol{v} is a vector field. [This is proved in Schutz (1980), whose discussion we follow here.] Use this rule to confirm the result in part (a).
(c) Comparing the result proved in the first parts of the question to the conservation of mass equation, show that
See Schutz (1980) for further details.
(39.7) We can write the non-relativistic Euler equation in flat space as $\frac{\partial}{\partial t} v_{i}+v^{j} \frac{\partial}{\partial x^{j}} v_{i}+\frac{1}{\rho_{0}} \frac{\partial}{\partial x^{i}} p+\frac{\partial}{\partial x^{i}} \Phi=0$.
Show that, with $\tilde{\boldsymbol{v}}=v_{i}\, \mathrm{d} x^{i}$, this can be rewritten in terms of Lie and exterior derivatives as
(39.8) If $\vec{v}$ is a uniform velocity field for a fluid in Euclidean space, find a 2-form $\tilde{\boldsymbol{f}}$ that outputs the flux of the fluid through a parallelogram formed by two displacements $\vec{a}$ and $\vec{b}$.
(39.9) We can use fluid mechanics to compute the behaviour of the cosmological fluid in the Einstein-de Sitter universe (Universe 3 in Chapter 18), whose dynamics we can assume is Newtonian.
(a) Show that for a Hubble-law expansion with $\vec{v}=(\dot{a} / a) \vec{r}$ we expect a non-relativistic Euler equation of
(b) Show that this is consistent with (i) the Friedmann equations, and (ii) a Poisson equation with potential $\Phi=2 \pi \rho r^{2} / 3$.
Now consider a perturbation
(h) Demonstrate that the equation has two solutions: one that decays as $1 / t^{\prime}$ and another that grows as $t^{\prime 2 / 3}$.
This is important as it shows that any perturbation to this universe either decays or grows very slowly.
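The two power-law behaviours can be checked symbolically. A sketch, assuming the standard Einstein-de Sitter density-perturbation equation $\ddot{\delta}+\frac{4}{3t}\dot{\delta}-\frac{2}{3t^{2}}\delta=0$ (this assumed form is not quoted in the exercise itself):

```python
import sympy as sp

t = sp.symbols('t', positive=True)
n = sp.symbols('n')
delta = t**n                      # power-law ansatz for the density contrast

# Assumed Einstein-de Sitter perturbation equation (see lead-in above)
eq = sp.diff(delta, t, 2) + sp.Rational(4, 3)/t*sp.diff(delta, t) \
     - sp.Rational(2, 3)/t**2*delta

# Dividing out t**(n-2) leaves the indicial polynomial n**2 + n/3 - 2/3
roots = sp.solve(sp.simplify(eq / t**(n - 2)), n)
print(sorted(roots))              # [-1, 2/3]: a decaying and a growing mode
```

The two roots match the exercise: the decaying mode $t^{-1}$ and the slowly growing mode $t^{2/3}$.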
↷\curvearrowright The material in this chapter is useful for understanding inflation (Chapter 41) and electromagnetism (Chapter 42). It also forms the basis for string theory (Chapter 49). ^(1){ }^{1} David Hilbert (1862-1943) was one of the most influential mathematicians of the nineteenth and twentieth centuries. An example of his influence was the collection of mathematical problems, presented in 1900, that largely set the agenda for research in the twentieth century. The Einstein-Hilbert action was proposed by Hilbert in 1915. ^(2){ }^{2} Einstein developed general relativity between 1907 and 1915. He arrived at his field equations in late 1915. The history is presented in Pais' Subtle is the Lord (2005), and also in Cheng's Einstein's Physics (2013). As pointed out by Ohanian and Ruffini, Einstein gave us general relativity early: we really only deserved it around twenty years later when the relativistic gauge-field techniques discussed in this part of the book had matured.
Lagrangian field theory
The reader will find no figures in this work. The methods which I set forth do not require either constructions or geometrical or mechanical reasonings: but only algebraic operations, subject to a regular and uniform rule of procedure. Those who love Analysis will see with pleasure that Mechanics has become a branch of it, and will be grateful to me for having thus extended its domain.
Joseph Lagrange (1736-1813) Mécanique Analytique
A classical field can be thought of as a machine whose input is a position $x$ in spacetime and whose output is the amplitude of the field at that point. General relativity is a classical field theory and so, like all field theories, we should therefore be able to deal with it using the machinery of Lagrangians. This approach has many advantages, not least that it represents a systematic means of deriving information about the fields and their interactions. Previously, for example, we have had to make educated guesses about the shape and content of the energy-momentum tensor $\boldsymbol{T}$. This is rather unsatisfactory given the importance of $\boldsymbol{T}$ in supplying the right-hand side of the Einstein field equation. In contrast to this ad-hoc approach, a Lagrangian field theory has an inbuilt, turn-the-handle recipe for deriving the energy-momentum tensor. We need only supply the matter fields and the Lagrangian-field-theory machine outputs a suitable field $\boldsymbol{T}(x)$ containing all of the information about the energy content of the fields we have inputted. In this chapter, we rederive the Einstein equation systematically by following the Lagrangian method. When general relativity was being formulated, David Hilbert$^{1}$ searched for a Lagrangian formulation that would describe gravitating systems. He found it (and it is now named in his honour) but only after Einstein had found his field equations via a less systematic method, narrowly beating Hilbert to the solution.$^{2}$
Before starting, it is worth a reminder of the fields with which we are concerned. General relativity couples the geometry of spacetime to the matter fields of the Universe. One class of fields is therefore the matter fields that describe the distribution of both massive and massless matter throughout spacetime. (This class of matter fields also includes the 1-form field $\tilde{\boldsymbol{A}}(x)$ that describes electromagnetism.) These matter fields go together to form $\boldsymbol{T}(x)$. The other class has a single member: it is the field that governs the curvature of spacetime. This is the $(0,2)$
metric field $\boldsymbol{g}(x)$. A Lagrangian field theory version of general relativity must couple these fields.
Let's now build a general Lagrangian formulation of classical field theory. We do this in such a way as to maximize the resemblance of field theory to the Lagrangian formulation of classical particle mechanics, which we have already met. Our procedure will involve writing down a Lagrangian that describes the metric field, the matter fields and their interactions. The Lagrangian can then be fed through a set of Euler-Lagrange equations to output a generalized form of the Einstein field equations.
40.1 Matter fields
Every particle in the Universe is an excitation of a matter field. This statement is the basis of quantum field theory (QFT), which describes the properties of matter fields on a quantum level. It does this by taking a classical description of the matter fields and quantizing it. That is, it turns the fields into quantum-mechanical operators that have commutation relations with each other. The quantum-field operators act on quantum states, creating or annihilating excitations. ^(3){ }^{3}
General relativity deals with classical fields, without the step of quantization described above. However, we still start by following the same route as is followed in QFT, where we write down a description of the matter in the Universe in terms of classical matter fields. We might wonder what the fields represent. We've said that a field can be thought of as a machine that inputs a position in spacetime and outputs an amplitude; in a classical field theory this amplitude typically describes the mass-energy density of matter.
Example 40.1
We shall meet several kinds of classical field in this chapter.
A scalar field inputs a position in spacetime $x$ and outputs a scalar amplitude that describes the distribution of matter. Specifically, the real scalar field $\phi(x)$ gives the probability amplitude for finding matter at point $\vec{x}$ at a time $t$, such that the probability of finding a particle in a region $\mathrm{d}\Sigma$ centred on $\vec{x}$ is $\phi(x)^{2}\,\mathrm{d}\Sigma$, where $\mathrm{d}\Sigma$ is an element of 3-volume. This field is quantized in QFT to describe excitations in the field, which are massive, spinless particles.
A vector field or 1-form field inputs a position in spacetime and outputs a vector-valued quantity or 1-form-valued quantity, respectively. An important example is the electromagnetic 1-form gauge field $\tilde{\boldsymbol{A}}(x)$ that can be used to determine the electric and magnetic fields at a point in spacetime. It is quantized in QFT into photons, the massless particle excitations of light.
In general relativity, we most often deal with the mass density field $\rho(x)$ for a perfect fluid. This tells us the density of fluid at a point $x$ in spacetime, such that the mass of fluid in a volume $\mathrm{d}\Sigma$ centred at $\vec{x}$ at a particular time is $\rho(x)\,\mathrm{d}\Sigma$. This is not a field that features in most standard treatments of QFT.
Knowing a field locally (that is, at our position in spacetime $x$) gives us access to the distribution of matter described by the field close to us (that $^{3}$ See Lancaster and Blundell, Quantum Field Theory for the Gifted Amateur (2014), for a full account.
$\curvearrowright$ Chapter 49 contains a discussion of approaches to quantum gravity.
$^{4}$ Some often-used notation is that, in flat space, partial derivatives with respect to a coordinate $x^{\mu}$ are often written as $\partial_{\mu}\equiv\partial/\partial x^{\mu}$. In comma notation, we would write $\partial_{\mu} \phi \equiv \phi_{, \mu}$. The advantage of this latter notation is that in curved space, commas can be upgraded to semicolons to signify a covariant partial derivative. $^{5}$ 'Lagrangian density' is usually shortened to 'Lagrangian' to save effort, which we'll do from here.
is, the fields are local). If we also have the field's equation of motion, we shall be able to make predictions about the behaviour of the field at other points in spacetime, and the consequences of its interactions with other fields. The first step towards finding this equation is to write down the action that describes the system, and it is to this we now turn.
40.2 Action and equations of motion
We have met the action of a system in particle mechanics. Mathematically the action is a functional $S[\;]$: a machine that inputs a function $\phi(x)$ (or possibly several functions) and outputs a number $S[\phi]$ with units of (energy) $\times$ (time). Classical particle mechanics involves writing the action $S$ of a system in terms of the coordinates $q_{i}$ of the particles involved and their time derivatives $\dot{q}_{i}$. The action is found using the Lagrangian $L\left(q_{i}, \dot{q}_{i}\right)$, which is a function that we integrate with respect to time to obtain the action itself. Specifically, for $N$ particles we have
In particle mechanics, the Lagrangian is simply the difference between the kinetic and potential energies of the system.
We now start upgrading this scheme to work on fields. In flat, three-dimensional space, the Lagrangian is related to the Lagrangian density $\mathcal{L}$ by
\begin{equation*}
L=\int \mathrm{d}^{3} x\, \mathcal{L} \tag{40.2}
\end{equation*}
so that
\begin{equation*}
S=\int \mathrm{d}^{4} x\, \mathcal{L} \tag{40.3}
\end{equation*}
The Lagrangian is so important that we often use a shorthand and call a Lagrangian or action that describes fields and their interaction a field theory.
Classical field theory involves writing the action $S$ of a system in terms of the matter field $\phi$ involved and its derivatives with respect to space and time.$^{4}$ This allows us to identify a Lagrangian density for the matter fields $\mathcal{L}_{\mathrm{m}}\left(\phi, \partial_{\mu} \phi\right)$: a function of the field and its derivatives that we integrate over all space and time to obtain the action. In flat space, we have, therefore, that
\begin{equation*}
S[\phi]=\int \mathrm{d}^{4} x\, \mathcal{L}_{\mathrm{m}}\left(\phi, \partial_{\mu} \phi\right) \tag{40.5}
\end{equation*}
The Lagrangian density of matter fields is analogous to the Lagrangian in particle mechanics: it reflects the difference in kinetic and potential energy densities of a field. In order to describe the matter fields in the Universe, we need to be able to write their Lagrangian densities in terms of the fields and their derivatives.$^{5}$
Example 40.2
Consider waves on a string of mass $m$ and length $\ell$. We define the mass density $\rho=m/\ell$, tension $\mathcal{T}$ and displacement $\psi(x, t)$ from the equilibrium (see Fig. 40.1). The kinetic energy $T$ can then be written as $T=\frac{1}{2} \int_{0}^{\ell} \mathrm{d} x\, \rho(\partial \psi / \partial t)^{2}$ and the potential energy as $V=\frac{1}{2} \int_{0}^{\ell} \mathrm{d} x\, \mathcal{T}(\partial \psi / \partial x)^{2}$. The action is then
\begin{align*}
S[\psi(x, t)] &= \int \mathrm{d} t\,(T-V)=\int \mathrm{d} t\, \mathrm{d} x\, \mathcal{L}_{\mathrm{m}}\left(\psi, \frac{\partial \psi}{\partial t}, \frac{\partial \psi}{\partial x}\right) \tag{40.6}\\
\text{where}\qquad \mathcal{L}_{\mathrm{m}}\left(\psi, \frac{\partial \psi}{\partial t}, \frac{\partial \psi}{\partial x}\right) &= \frac{\rho}{2}\left(\frac{\partial \psi}{\partial t}\right)^{2}-\frac{\mathcal{T}}{2}\left(\frac{\partial \psi}{\partial x}\right)^{2} \tag{40.7}
\end{align*}
is the Lagrangian density.
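The passage from the Lagrangian density of eqn 40.7 to an equation of motion can be sketched symbolically; `euler_equations` below is SymPy's built-in variational helper, doing the same steps that Example 40.4 carries out by hand:

```python
import sympy as sp
from sympy.calculus.euler import euler_equations

x, t = sp.symbols('x t')
rho, T = sp.symbols('rho T', positive=True)   # mass density and tension
psi = sp.Function('psi')(x, t)                # string displacement psi(x, t)

# Lagrangian density of the string, eqn 40.7
L = sp.Rational(1, 2)*rho*sp.diff(psi, t)**2 \
    - sp.Rational(1, 2)*T*sp.diff(psi, x)**2

# Euler-Lagrange yields  T psi_xx - rho psi_tt = 0 :
# the wave equation with v^2 = T/rho
eq, = euler_equations(L, [psi], [x, t])
print(eq)
```

The convenience here is that the same call works for any Lagrangian density depending on a field and its first derivatives.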
In most cases we assume a form for the Lagrangian such as those discussed in the next example.
Example 40.3
A free scalar field has the Lagrangian
This can be interpreted as the difference between a kinetic energy density $T$ and a potential energy density $V$. The first term is the kinetic energy density, built from the time derivative of the field. The second is the potential energy cost for a field changing in space. The third term represents the energy cost of the field existing in spacetime at all.
It is important to note that, if there is more than one field present, we must write the action for all of the fields in terms of the sum of their Lagrangians, along with a Lagrangian for any interactions between the fields. So if we have two fields $\phi$ and $\psi$ then the action is written $S=\int \mathrm{d}^{4} x\, \mathcal{L}=\int \mathrm{d}^{4} x\left[\mathcal{L}_{\mathrm{m}}\left(\phi, \phi_{, \mu}\right)+\mathcal{L}_{\mathrm{m}}\left(\psi, \psi_{, \mu}\right)+\mathcal{L}_{\text {int }}(\phi, \psi)\right], \quad$ (40.11) where $\mathcal{L}_{\text {int }}$ encodes the interaction between the fields. This can also be written as a sum of actions
As for particles, the equations of motion of a field are found by extremizing the total action: varying the fields and setting the variation in the action to zero. Extremizing the action $S$ amounts to the condition$^{6}$ $\delta S=\delta \int \mathrm{d}^{4} x\, \mathcal{L}=\int \mathrm{d}^{4} x\, \delta \mathcal{L}=0$, where the variation in the Lagrangian can be calculated by using the well-known rule for differentials.$^{7}$
Fig. 40.1 Displacements of a string. The displacement from equilibrium is $\psi(x, t)$ and the equation of motion can be derived by considering an element of the string of length $\mathrm{d} x$ and mass $\rho\, \mathrm{d} x$. The figure shows a short section in the middle of the string, which is assumed to be tethered at either end so that $\psi(0, t)=\psi(\ell, t)=0$. $^{6}$ This only works assuming that the integration measure doesn't change when we vary the matter field. This latter point is where the complication comes in gravitation, since we vary the action with respect to the components of the metric field, which in turn changes the integration measure. $^{7}$ For a function $f(x, y)$ we write $\delta f=\left(\frac{\partial f}{\partial x}\right)_{y} \delta x+\left(\frac{\partial f}{\partial y}\right)_{x} \delta y$. (40.13) $^{8}$ Recall that for a particle the Euler-Lagrange equation is
or as $\partial^{2} \psi=0$ for short. In Chapter 46, we shall see how excitations in the metric field obey this equation, giving us gravitational waves.
Example 40.4
We vary the action $S$ by changing the fields within a (3+1)-dimensional region $\mathcal{D}$ of flat spacetime, subject to the condition that the variation of the fields $\delta \phi$ vanishes at the boundary of $\mathcal{D}$. The variation $\delta S$ yields
where $\mathrm{d} \sigma$ labels elements of the (three-dimensional) surface on the boundary $\partial \mathcal{D}$ of our region $\mathcal{D}$. This integral must be zero since $\delta \phi$ vanishes at the boundary $\partial \mathcal{D}$. Collecting the remaining terms from eqns 40.14 and 40.15 we see that
and so the wave equation $\left(\partial^{2} \psi / \partial x^{2}\right)=\left(1 / v^{2}\right)\left(\partial^{2} \psi / \partial t^{2}\right)$ with $v=\sqrt{\mathcal{T} / \rho}$ emerges almost effortlessly.$^{9}$ We say that the field $\psi$ hosts wave-like excitations.
Example 40.6
For the free scalar field in Minkowski space from Example 40.3 we find $\frac{\partial \mathcal{L}_{\mathrm{m}}}{\partial \phi}=-m^{2} \phi, \quad \frac{\partial \mathcal{L}_{\mathrm{m}}}{\partial \phi_{, \mu}}=-\partial^{\mu} \phi$,
leading to the equation of motion $\left(-\partial^{2}+m^{2}\right) \phi=0$,
where $\partial^{2} \phi=\partial_{\mu} \partial^{\mu} \phi=g^{\mu \nu} \partial_{\mu} \partial_{\nu} \phi=-\frac{\partial^{2} \phi}{\partial t^{2}}+\frac{\partial^{2} \phi}{\partial x^{2}}+\frac{\partial^{2} \phi}{\partial y^{2}}+\frac{\partial^{2} \phi}{\partial z^{2}}$. The equation of motion is known as the Klein-Gordon equation.
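The same variational machinery reproduces the Klein-Gordon equation. A sketch in 1+1 dimensions with signature $(-,+)$ (the restriction to one spatial dimension is just to keep the code short):

```python
import sympy as sp
from sympy.calculus.euler import euler_equations

t, x = sp.symbols('t x')
m = sp.symbols('m', positive=True)
phi = sp.Function('phi')(t, x)

# Free scalar field Lagrangian in 1+1D Minkowski space, signature (-,+):
# L = (1/2)(phi_t^2 - phi_x^2 - m^2 phi^2)
L = sp.Rational(1, 2)*(sp.diff(phi, t)**2 - sp.diff(phi, x)**2 - m**2*phi**2)

# Euler-Lagrange, rearranged: phi_tt - phi_xx + m^2 phi = 0,
# i.e. (-partial^2 + m^2) phi = 0
eq, = euler_equations(L, [phi], [t, x])
print(eq)
```

Restoring the remaining spatial dimensions simply adds $-\phi_{yy}-\phi_{zz}$ terms, matching the $\partial^{2}$ operator written out above.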
40.3 Fields in curved spacetime
When spacetime is curved, the usual flat-space volume element $\mathrm{d}^{4} x$ is replaced by the invariant element$^{10}$ $\sqrt{-g}\, \mathrm{d}^{4} x$. For a scalar matter field $\phi$, the action functional is given by
When spacetime is curved the partial derivative $\partial_{\mu}$ is upgraded to the covariant derivative $\nabla_{\mu}$. For a scalar we note that $\partial_{\mu} \phi=\nabla_{\mu} \phi$ and so our previous equation for the action remains valid with $\mathcal{L}_{\mathrm{m}}\left(\phi, \phi_{, \mu}\right) \rightarrow \mathcal{L}_{\mathrm{m}}\left(\phi, \phi_{; \mu}\right)$. We might guess that, with the inclusion of the covariant derivative, the Euler-Lagrange equations for scalar fields should look like
It does.
When applied to vector or tensor fields, the Euler-Lagrange equation outputs the equations of motion for each component of a field. This is because the Lagrangian machinery works in terms of components. In generalizing the above equation, we therefore need a notation that picks out the components of the covariant derivative. This is the useful semicolon notation,$^{11}$ which allows us to write the Euler-Lagrange equation for scalar fields as
Dealing with the equations of motion of matter fields is fairly straightforward in curved spacetime. The complication compared to Minkowski space is that we must be careful to include the components of the metric field in our manipulations.$^{12}$
Example 40.7
For the scalar field in curved spacetime we have the Lagrangian density
We have $\partial \mathcal{L}_{\mathrm{m}} / \partial \phi=-m^{2} \phi$. Differentiating $\mathcal{L}$ with respect to $\nabla_{\mu} \phi=\phi_{; \mu}$ gives us
$^{10}$ The volume 4-form, written in the coordinate frame, is $\tilde{\boldsymbol{\omega}}=\sqrt{-g}\, \boldsymbol{d} t \wedge \boldsymbol{d} x \wedge \boldsymbol{d} y \wedge \boldsymbol{d} z$. Inserting Cartesian infinitesimal vectors $\mathrm{d} t\, \boldsymbol{e}_{t}$, $\mathrm{d} x\, \boldsymbol{e}_{x}$, $\mathrm{d} y\, \boldsymbol{e}_{y}$ and $\mathrm{d} z\, \boldsymbol{e}_{z}$ yields the volume element $\sqrt{-g}\, \mathrm{d} t\, \mathrm{d} x\, \mathrm{d} y\, \mathrm{d} z$. $^{11}$ Remember $A^{\nu}{ }_{; \mu}=\partial_{\mu} A^{\nu}+\Gamma^{\nu}{ }_{\mu \beta} A^{\beta}$ and $A_{\nu ; \mu}=\partial_{\mu} A_{\nu}-\Gamma^{\beta}{ }_{\mu \nu} A_{\beta}$. $^{12}$ Here it is helpful to remember that the partial derivative $\partial_{\mu}$ is naturally lowered, which is to say that the components are in the down position. This is also true of the covariant derivative $\nabla_{\mu}$. We therefore make products of these derivatives using the all-up components $g^{\mu \nu}$ of the metric. It is also worth noting (since we usually deal with the all-down components of the metric $\boldsymbol{g}$) that it is only in the case that the metric is diagonal that we have $g_{\mu \nu}=1 / g^{\mu \nu}$. $^{13}$ The semicolon notation is especially useful here. Explicitly we have that
${ }^{14}$ The constant is used here to ensure that, however we have defined the matter fields, when combined with the Einstein–Hilbert action they correctly produce the $8 \pi$ term in the Einstein equation $\boldsymbol{G}=8 \pi \boldsymbol{T}$. ${ }^{15}$ We state this without proof, but the inclusion of the Ricci scalar field $R(x)$ is perhaps not completely unexpected. By the principle of general covariance, we would be looking for a field which transforms properly, and we might expect something fairly simple and fundamental to appear there. As we shall see, though, the proof of the pudding is that it produces the right answer.
We therefore have a systematic means of dealing with matter fields in curved spacetime.
40.4 Motivating the Einstein equation
How do we deal with the geometry of spacetime, expressed via $\boldsymbol{g}(x)$, in the Lagrangian formalism? This is the problem that was solved by David Hilbert, with the solution now called the Einstein–Hilbert action $S_{\mathrm{EH}}$. The idea is that we add this new term $S_{\mathrm{EH}}$, which reflects the $\boldsymbol{g}(x)$ field, to the action $S_{\mathrm{m}}$ of the system of matter fields. The total action is then
where $\alpha$ is a constant${ }^{14}$ and $S_{\mathrm{m}}$ contains the contributions from all of the matter fields in the Universe. The equation of motion (a.k.a. the Einstein equation) then results from the variation of the total action, found by setting $\delta S=0$. The variation of the matter fields should be expected to supply the right-hand side of the Einstein equation, reflecting the mass-energy in the Universe, while the variation of the Einstein–Hilbert action should give us the geometrical left-hand side.
A complication here is that, unfortunately, we can't simply guess a total Lagrangian and feed it into the Euler-Lagrange equations to extract the Einstein equation. This is because the field that we must vary, subject to the condition that $\delta S=0$, is the metric field $\boldsymbol{g}(x)$. The variation therefore has the effect of changing the intervals between points in spacetime itself and, in particular, changing the volume element $\sqrt{-g}\, \mathrm{d}^{4} x$ in the action integral. We must therefore vary the total action with respect to the components $g_{\mu \nu}$ of the metric from first principles, such that $\delta S$ vanishes. It is also important that we vary the action with respect to $g_{\mu \nu}$ and not the all-up components $g^{\mu \nu}$. (This is because the gravitational field is most directly represented by the $(0,2)$ tensor field $\boldsymbol{g}(\ ,\ )$ in which we input vectors.) We therefore seek
After all of these preliminaries, we are now ready to meet Hilbert's famous result.${ }^{15}$ The Lagrangian density for geometry is given by the Ricci scalar field $R(x)$, such that the Einstein–Hilbert action is written as
\begin{equation*}
S_{\mathrm{EH}}=\int \mathrm{d}^{4} x \sqrt{-g(x)}\, R(x) \tag{40.36}
\end{equation*}
This is added to the matter action $\alpha S_{\mathrm{m}}=\int \mathrm{d}^{4} x \sqrt{-g(x)}\, \alpha \mathcal{L}_{\mathrm{m}}$, where $\alpha$ is the constant discussed above. Varying this total Lagrangian with respect to the components of the metric results in the Einstein equations describing the dynamics that link matter fields and geometrical fields.
In order to vary the geometry Lagrangian in eqn 40.36, we shall need three new mathematical tricks: two are rules for the variation of the metric tensor and a third is for varying the Riemann tensor. The first comes from our need to vary the action with respect to the components $g_{\mu \nu}$. It turns out that this forces us to also consider the result of varying the components $g^{\mu \nu}$. The rule we need is that there is an extra minus sign relating the variation of the up and down versions of the metric:
Example 40.8
The identity can be proven by considering the variation of the matrix equation $\underline{\boldsymbol{M}}\, \underline{\boldsymbol{M}}^{-1}=I$ to give
Note the important minus sign! Representing the components of the metric as a matrix with $\underline{\boldsymbol{M}}_{\mu \nu}=g_{\mu \nu}$ and $\left(\underline{\boldsymbol{M}}^{-1}\right)_{\mu \nu}=g^{\mu \nu}$, we must then have
To prove this we use the identity${ }^{16}$ $\ln \operatorname{det} \underline{M}=\operatorname{Tr} \ln \underline{M}$. Differentiate the identity to find
Plugging in the metric leads to the equation claimed above.
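Both variation rules can be checked numerically. The sketch below (our addition, not from the text) uses a symmetric $2 \times 2$ "metric" matrix and a small perturbation to verify, by finite differences, that $\delta(\underline{M}^{-1})=-\underline{M}^{-1}\, \delta\underline{M}\, \underline{M}^{-1}$ (the minus-sign rule) and that $\delta \ln \operatorname{det} \underline{M}=\operatorname{Tr}(\underline{M}^{-1}\, \delta\underline{M})$.

```python
import math

# Helpers for 2x2 matrices represented as nested lists.
def det2(m):
    return m[0][0]*m[1][1] - m[0][1]*m[1][0]

def inv2(m):
    d = det2(m)
    return [[ m[1][1]/d, -m[0][1]/d],
            [-m[1][0]/d,  m[0][0]/d]]

def matmul(a, b):
    return [[sum(a[i][k]*b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

g   = [[2.0, 0.3], [0.3, 1.5]]     # symmetric "metric", det = 2.91 > 0
dg  = [[0.4, -0.1], [-0.1, 0.2]]   # direction of the variation
eps = 1e-6                          # small parameter of the variation

g_pert = [[g[i][j] + eps*dg[i][j] for j in range(2)] for i in range(2)]

# Rule 1: the variation of the inverse carries an extra minus sign,
# delta(M^-1) = -M^-1 dM M^-1, so the finite difference num should equal -ana.
num = [[(inv2(g_pert)[i][j] - inv2(g)[i][j]) / eps for j in range(2)]
       for i in range(2)]
ana = matmul(matmul(inv2(g), dg), inv2(g))
assert all(abs(num[i][j] + ana[i][j]) < 1e-4 for i in range(2) for j in range(2))

# Rule 2: delta(ln det M) = Tr(M^-1 dM).
num_ld = (math.log(det2(g_pert)) - math.log(det2(g))) / eps
tr = sum(matmul(inv2(g), dg)[i][i] for i in range(2))
assert abs(num_ld - tr) < 1e-4
print("both variation rules verified numerically")
```

Running the check confirms both rules to within the finite-difference error, which is of order $\epsilon$.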
The third and final rule we shall need is the Palatini identity,${ }^{17}$ which says that the variation of the components of the Ricci tensor $R_{\alpha \beta}$ can be written in terms of covariant derivatives of the variation of the connection coefficients. The identity reads
${ }^{16}$ The proof of this identity, which can be found in many sources, is left as an exercise. Alternatively, one can follow Exercise 34.3. ${ }^{17}$ Attilio Palatini (1889-1949) showed that the variations of the connection coefficients are the coordinate components of a tensor. We won't reproduce the derivation here, but note that the Palatini identity in eqn 40.43 is proved in Zee (2013).
With these three identities under our belts, we can vary the Einstein–Hilbert action. There's nothing for it now but to plunge into the algebra.
Example 40.10
${ }^{18}$ That is, we start from $R=g^{\alpha \beta} R_{\alpha \beta}$. Written out in terms of components of the Ricci tensor,${ }^{18}$ we have that the Einstein–Hilbert action comprises three objects
The result in terms of $\delta g_{\mu \nu}$ and $R^{\mu \nu}$ confirms that writing $\delta I_{2}$ in terms of $R_{\sigma \rho}$ and not $R^{\sigma \rho}$ was the correct choice: the latter would not have had indices consistent with the term from $\delta I_{3}$.
The most problematic part is the final integral, for $\delta I_{1}$. Fortunately, this only contributes a surface term, which vanishes. The key to seeing this is to use the Palatini identity (Rule 3), which says
using $\nabla_{\alpha} g^{\mu \nu}=0$. Finally, we use the divergence theorem to say that the derivative may be written as a surface integral over the boundary. This vanishes because the variation vanishes on the boundary by definition, and so this term does not contribute. The result is that the contribution to the equations of motion is
The variation of the action should vanish, so we must have that $\delta S=\delta I_{2}+\delta I_{3}=0$. The way to guarantee this is that the integrand must vanish.
The conclusion of this lengthy set of manipulations is that the components of the Einstein tensor $G^{\mu \nu}=\left(R^{\mu \nu}-\frac{1}{2} g^{\mu \nu} R\right)$ emerge as the contribution to the (Einstein) equation of motion and that they must vanish. This gives the Einstein equation in the slightly stripped-down form $R^{\mu \nu}-\frac{1}{2} g^{\mu \nu} R=0$. This simplified form arises because we have not yet included the matter fields in the action. The matter fields must therefore spit out the energy-momentum tensor, and so we now turn to examine how the energy-momentum tensor can be derived from the action.
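A quick check (our addition, not in the text) shows why this stripped-down form is equivalent to Ricci-flatness. Contracting with $g_{\mu \nu}$, and using $g_{\mu \nu} g^{\mu \nu}=4$ in four dimensions,

```latex
g_{\mu\nu}\left(R^{\mu\nu}-\tfrac{1}{2}g^{\mu\nu}R\right)
   = R - \tfrac{1}{2}\cdot 4\, R = -R = 0 ,
```

so the Ricci scalar vanishes, and the vacuum Einstein equation reduces to $R^{\mu \nu}=0$.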
40.5 Energy-momentum tensor
Often the energy-momentum tensor is determined ad hoc, using all of the physical insight we can muster. However, field theory tells us that there should be a way to simply and mechanically extract it from the Lagrangian. There is, as we see in the next example.
Example 40.11
We can gain access to the energy-momentum tensor if we vary the action with respect to the components of the metric. The action is given by
\begin{equation*}
S=S_{\mathrm{EH}}+\alpha S_{\mathrm{m}}=\int \mathrm{d}^{4} x \sqrt{-g}\left(\mathcal{L}_{\mathrm{EH}}+\alpha \mathcal{L}_{\mathrm{m}}\right) \tag{40.51}
\end{equation*}
If we vary with respect to the components of the metric, we obtain
where we have used the components of the Einstein tensor $G^{\mu \nu}=R^{\mu \nu}-\frac{1}{2} g^{\mu \nu} R$.
Comparing eqn 40.52 with the Einstein equation, we make the suggestion
We choose $\alpha=16 \pi$, which leads to the Einstein equation $G_{\mu \nu}=8 \pi T_{\mu \nu}$, which is what the variation must produce. We conclude that the action
\begin{equation*}
S=\int \mathrm{d}^{4} x \sqrt{-g}\left[R(x)+16 \pi \mathcal{L}_{\mathrm{m}}\right] \tag{40.56}
\end{equation*}
when varied with respect to the components of the metric, leads to the Einstein equation.
Expanding eqn 40.54, we have a final prescription for calculating the energy-momentum tensor from the matter action:
We use this below. However, because the derivative operator $\partial_{\mu}$ is naturally an object with a down index, the action for matter fields is most simply written in terms of a derivative with respect to $g^{\mu \nu}$. It's then easiest to use an analogous expression for the all-down components of $\boldsymbol{T}$, which is${ }^{19}$
which has timelike component $T_{00}=\frac{1}{2}\left(\partial_{0} \phi\right)^{2}+\frac{1}{2}(\vec{\nabla} \phi)^{2}+\frac{m^{2}}{2} \phi^{2}$, which simply reflects the sum of kinetic energy and potential energy.
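As a consistency check (our sketch, assuming the scalar Lagrangian $\mathcal{L}_{\mathrm{m}}=-\frac{1}{2} g^{\mu \nu} \partial_{\mu} \phi\, \partial_{\nu} \phi-\frac{1}{2} m^{2} \phi^{2}$ and signature $(-+++)$), the all-down prescription gives

```latex
T_{\mu\nu} = \partial_{\mu}\phi\,\partial_{\nu}\phi + g_{\mu\nu}\,\mathcal{L}_{\mathrm m},
\qquad\text{so in flat spacetime}\qquad
T_{00} = (\partial_{0}\phi)^2 - \mathcal{L}_{\mathrm m}
       = \tfrac{1}{2}(\partial_{0}\phi)^2 + \tfrac{1}{2}(\vec\nabla\phi)^2 + \tfrac{1}{2}m^2\phi^2 ,
```

using $\eta_{00}=-1$, which reproduces the timelike component quoted above.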
Electromagnetism is described by the Lagrangian
The previous example shows how we can extract the energy-momentum tensor for the fields in the Universe. An exception is gravitation itself. In the same way that general relativity removes the notion of a gravitational force, it also prevents us from identifying an energy-momentum tensor for the gravitational field.
40.6 Noether's theorem
One of the most profound statements in field theory is Noether's theorem.${ }^{20}$ It says that if a field theory has a symmetry then there is a corresponding conservation equation and a conserved quantity. Noether's theorem elegantly reveals these conservation equations, and we will consider how it can be used on the metric field. ${ }^{20}$ Amalie Emmy Noether (1882-1935).
Consider the differential of the Lagrangian in Minkowski space
If we make a continuous transformation of the variables in the Lagrangian and find that the action doesn't change,${ }^{21}$ we say that the transformation represents a symmetry. In order to examine the symmetries corresponding to the geometric features of the theory, we are going to pull this Lagrangian along a vector $\boldsymbol{\xi}$ and compare the result with our original Lagrangian. ${ }^{21}$ Strictly, if it doesn't change up to a surface term.
The mathematical tool for comparing a field after it has been transported by a vector is, of course, the Lie derivative. We therefore upgrade the differential $\delta \mathcal{L}$ to a Lie derivative
We have seen that in examining geometry (encoded in the metric field), symmetries are described in terms of isometries of the metric, which give rise to Killing fields. So we set $\boldsymbol{\xi}$ to be a Killing field, defined by $£_{\xi} \eta_{\alpha \beta}=0$, which wipes out the final term in eqn 40.69. We can then use the Euler-Lagrange equation${ }^{22}$ to substitute for $\partial \mathcal{L} / \partial \phi$. Rearranging then gives
This is a conservation equation${ }^{23}$ that tells us the quantity in the bracket is conserved. This is Noether's theorem: a symmetry (identified via the Killing vector $\boldsymbol{\xi}$) has given rise to a conserved Noether current $\boldsymbol{J}_{\mathrm{N}}$ with components
We can group the four Noether currents that result from translations into a single tensor $\boldsymbol{S}$. We call this quantity the canonical energy-momentum tensor. It has components
so that $\partial_{\alpha} S^{\alpha \beta}=0$.
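For the free scalar field in Minkowski space, for instance, the canonical tensor works out to the standard form below (our illustration, assuming $\mathcal{L}=-\frac{1}{2} \eta^{\mu \nu} \partial_{\mu} \phi\, \partial_{\nu} \phi-\frac{1}{2} m^{2} \phi^{2}$; sign conventions vary between texts):

```latex
S^{\alpha\beta} = \partial^{\alpha}\phi\,\partial^{\beta}\phi + \eta^{\alpha\beta}\mathcal{L},
\qquad
S^{00} = \tfrac{1}{2}\dot\phi^{2} + \tfrac{1}{2}(\vec\nabla\phi)^{2} + \tfrac{1}{2}m^{2}\phi^{2},
```

so the conserved charge $\int S^{00}\, \mathrm{d}^{3} x$ is the field energy, and for this scalar field $\boldsymbol{S}$ coincides with the metric-derived $\boldsymbol{T}$.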
The tensor $\boldsymbol{S}$ and the energy-momentum tensor $\boldsymbol{T}$ are the same for scalar fields, but differ for more complicated matter fields. Generally, $\boldsymbol{S}$ isn't symmetric and doesn't generalize to curved spacetimes, so it is unsuitable for incorporation into Einstein's equation.${ }^{24}$ ${ }^{22}$ The Euler-Lagrange equation is
${ }^{23}$ This is an equation of continuity, telling us that the divergence of something is zero. In some cases, we may have to worry about the global boundary conditions to show that there is no effect of a surface term. We will not worry about that here. ${ }^{24}$ The need for a symmetrical energy-momentum tensor is discussed in the exercises. Since the algebra is rather involved, this argument is left out of most books. The technical detail is all contained in Example 40.14, which can be skipped if so desired.
40.7 The perfect fluid
Fluids were examined in detail in Chapter 39. Here we show how the perfect fluid's equation of motion and energy-momentum tensor can be derived directly from a Lagrangian. The perfect fluid has no viscosity and moves adiabatically. In field theory, a fluid is described by two features: (i) its mass-density field $\rho_{0}(x)$ and (ii) a congruence of curves called flow lines (or stream lines) with timelike tangent vectors $\boldsymbol{w}=\frac{\partial}{\partial \tau}$. When normalized, the tangent $\boldsymbol{w}$ becomes the velocity field $\boldsymbol{v}$ of the fluid. That is to say
so that $\boldsymbol{v}^{2}=\boldsymbol{g}(\boldsymbol{v}, \boldsymbol{v})=-1$, as we require for a velocity field. The mass current is $\boldsymbol{J}=\rho_{0} \boldsymbol{v}$ and we impose the condition that $\boldsymbol{\nabla} \cdot \boldsymbol{J}=0$ or, equivalently, $J^{\alpha}{}_{; \alpha}=0$, which is to say that the current is conserved. The fluid has a Lagrangian
where $u$ is the specific internal energy, which is a function of $\rho_{0}$ [that is, $u=u\left(\rho_{0}\right)$].${ }^{25}$ The quantity $\rho=\rho_{0}(1+u)$ is the total energy density of the fluid.
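The normalization condition $\boldsymbol{g}(\boldsymbol{v}, \boldsymbol{v})=-1$ is easy to illustrate numerically. The sketch below (our addition, using the flat Minkowski metric $\eta=\operatorname{diag}(-1,1,1,1)$ and an arbitrary timelike tangent vector) normalizes $\boldsymbol{w}$ via $\boldsymbol{v}=[-\boldsymbol{g}(\boldsymbol{w}, \boldsymbol{w})]^{-\frac{1}{2}} \boldsymbol{w}$ and checks the result.

```python
import math

eta = [-1.0, 1.0, 1.0, 1.0]            # diagonal Minkowski metric, signature (-+++)

def dot(a, b):
    """Metric inner product g(a, b) for the diagonal metric eta."""
    return sum(eta[i]*a[i]*b[i] for i in range(4))

w = [2.0, 1.0, 0.5, 0.0]               # a timelike tangent vector: g(w, w) < 0
norm = math.sqrt(-dot(w, w))           # v = w / sqrt(-g(w, w))
v = [wi / norm for wi in w]

assert dot(w, w) < 0                   # w is indeed timelike
assert abs(dot(v, v) + 1.0) < 1e-12    # g(v, v) = -1, as required
print("g(v, v) =", dot(v, v))
```

The same construction works in curved spacetime with $g_{\mu \nu}$ in place of $\eta$, provided $\boldsymbol{w}$ is timelike.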
To derive the equations of motion we need to vary the action, which involves varying the flow lines. This changes $\boldsymbol{v}$, so while we vary the flow lines, we imagine adjusting $\rho_{0}$ to ensure that the current $\boldsymbol{J}$ is conserved throughout. This enables us to continue using the conservation law. The variation of the flow lines can be followed by defining $\boldsymbol{K}$ as the displacement vector of a point on a flow line as we vary the flow. How much the flow has changed can then be measured by carrying the flow pattern along the vector field $\boldsymbol{K}$ and comparing the tangent fields before and after. That is to say, the effect of the variation is contained in the variation in $\boldsymbol{w}$, which is measured by a Lie derivative
What follows is a long computation that we split into several steps.${ }^{26}$
Example 40.14
Step 0: Write the variation in $S$ in terms of a differential $\delta \rho_{0}$ in the usual way, by saying
\begin{align*}
\delta S & =-\int \mathrm{d}^{4} x \sqrt{-g} \frac{\mathrm{d} \mathcal{L}}{\mathrm{d} \rho_{0}} \delta \rho_{0} \\
& =-\int \mathrm{d}^{4} x \sqrt{-g}\left(1+\frac{\mathrm{d}\left(\rho_{0} u\right)}{\mathrm{d} \rho_{0}}\right) \delta \rho_{0}=0 . \tag{40.80}
\end{align*}
Step I: In order to use the expression for $\delta S$, we need an expression for $\delta \rho_{0}$. One can be found from our conservation-of-current equation. Since $\delta\left(J^{\alpha}{}_{; \alpha}\right)=0=\left(\delta J^{\alpha}\right)_{; \alpha}$ we have
Step II: In order to use the previous equation, we next need an expression for $\delta \boldsymbol{v}$. This can be extracted from $\delta \boldsymbol{w}$, which was given by the Lie derivative. So, since $\boldsymbol{v}=[-\boldsymbol{g}(\boldsymbol{w}, \boldsymbol{w})]^{-\frac{1}{2}} \boldsymbol{w}$, we have, using the chain rule,
where, in the final line, we've written everything in terms of components of $\boldsymbol{v}$ and the displacement field $\boldsymbol{K}$.
Step III: Substitute for $\delta v^{\alpha}$ in eqn 40.81 (which, remember, only describes the conservation law $J^{\alpha}{}_{; \alpha}=0$). We find that the number of terms seems to increase rather alarmingly! Specifically, we find
This looks truly formidable, but can be simplified dramatically if we integrate along the flow lines. The key here is that the covariant derivatives (identified in the above using the semicolon notation) are directional derivatives along the flow lines. Integrating along the flow lines can then be thought of as undoing these covariant derivatives. Along the flow lines we have local mass conservation${ }^{27}$ written as $J^{\alpha}{}_{; \alpha}=\left(\rho_{0} v^{\alpha}\right)_{; \alpha}=0$. It turns out that if we collect the terms which contain a factor $v^{\alpha}{}_{; \alpha}$ and demand that these vanish, this provides the relationship between $\delta \rho_{0}$ and $v^{\alpha}, \rho_{0}$ and $K^{\alpha}$ that we need. The instruction to integrate along the flow lines is especially important, as it allows us to integrate those terms involving a covariant derivative by parts, effectively changing the positions of the semicolons. This allows us to massage the indices in such a way as to invoke $\left(\rho_{0} v^{\alpha}\right)_{; \alpha}=0$.${ }^{28}$ The terms contributing a factor $v^{\alpha}{}_{; \alpha}$ are
It's not obvious that the second term is one of the ones we want, until we integrate it by parts to get $-\int\left(\rho_{0} K^{\beta}\right)_{; \beta} v^{\alpha}{}_{; \alpha}$.
We have now collected all terms in the integrand with a factor $v^{\alpha}{}_{; \alpha}$, and since the divergence of $v^{\alpha}$ does not vanish (unless the density is constant) this integrand must do. The result is an equation for $\delta \rho_{0}$:
We can plug this into eqn 40.80, which expresses the variation of the action. ${ }^{27}$ In general, as we shall show below, the flow lines don't obey the geodesic equation, and so $v^{\beta}\left(v^{\alpha}\right)_{; \beta} \neq 0$. This is due to the possibility of pressure gradients perpendicular to the flow lines in the fluid. ${ }^{28}$ As an example of integration by parts, consider
where the integrals are carried out along the flow lines. The first term on the right vanishes as we demand that all fields die off to zero at infinity. This leaves the second term. We see how the position of the $; \beta$ has changed and we have picked up a minus sign. ${ }^{29}$ This is eqn 39.77 from the last chapter. ${ }^{30}$ This is discussed in Exercise 34.3.
Step IV: The variation in the Lagrangian is therefore
Step V: In the final step, we integrate this equation by parts, moving the semicolons from the factors of $K^{\beta}$ onto the other terms, in such a way that they have a common factor of $K^{\beta}$. The result is
$$\delta S=\int \mathrm{d}^{4} x \sqrt{-g}\left\{\rho_{0}\left[1+\frac{\mathrm{d}\left(\rho_{0} u\right)}{\mathrm{d} \rho_{0}}\right] v^{\alpha}{}_{; \beta} v^{\beta}+\rho_{0}\left[\frac{\mathrm{d}\left(\rho_{0} u\right)}{\mathrm{d} \rho_{0}}\right]_{, \gamma}\left(g^{\gamma \alpha}+v^{\gamma} v^{\alpha}\right)\right\} K_{\alpha}=0 .$$
Since $K_{\alpha}$ does not vanish, the integrand must do, and we have our result for the variation of the fluid action. Finally, we interpret $\dot{v}^{\alpha}=v^{\alpha}{}_{; \beta} v^{\beta}$ as the acceleration of the flow lines and recall that the pressure of the fluid is given by $p=\rho_{0}^{2} \partial u / \partial \rho_{0}$.
We conclude that the equation of motion for the fluid is given by
where the energy density is written as $\rho=\rho_{0}(1+u)$.${ }^{29}$ This is the equation of motion for the fluid, Euler's equation, written for curved spacetime. As promised, the geodesic equation $\dot{v}^{\alpha}=0$ is not obeyed: the flow lines accelerate in response to transverse pressure gradients.
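The bookkeeping connecting the bracketed terms in $\delta S$ to Euler's equation runs as follows (our reconstruction; the displayed equation of motion is not reproduced in this extract). Using $\rho=\rho_{0}(1+u)$ and $p=\rho_{0}^{2}\, \mathrm{d} u / \mathrm{d} \rho_{0}$,

```latex
\rho_0\!\left[1+\frac{\mathrm{d}(\rho_0 u)}{\mathrm{d}\rho_0}\right]
   = \rho_0(1+u) + \rho_0^{2}\frac{\mathrm{d}u}{\mathrm{d}\rho_0} = \rho + p,
\qquad
\rho_0\!\left[\frac{\mathrm{d}(\rho_0 u)}{\mathrm{d}\rho_0}\right]_{,\gamma} = p_{,\gamma},
```

so the vanishing of the integrand gives the relativistic Euler equation

```latex
(\rho + p)\,\dot v^{\alpha} = -\left(g^{\gamma\alpha} + v^{\gamma}v^{\alpha}\right) p_{,\gamma} ,
```

in which the projection tensor $g^{\gamma \alpha}+v^{\gamma} v^{\alpha}$ picks out the pressure gradient transverse to the flow.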
Next, we extract the energy-momentum tensor using our prescription from earlier (eqn 40.57), that the up components are given by $T^{\mu \nu}=2 \frac{\delta \mathcal{L}_{\mathrm{m}}}{\delta g_{\mu \nu}}+g^{\mu \nu} \mathcal{L}_{\mathrm{m}}$.
Example 40.15
Starting with the matter Lagrangian $\mathcal{L}=-\rho_{0}(1+u)$, we have that
We need access to the final term. This is achieved with a trick: the divergence of the current can be written in terms of the determinant of the metric matrix as${ }^{30}$
This is helpful as it tells us that the quantity $(-g)^{\frac{1}{2}} J^{\alpha}$ is unchanged as the metric is varied. Dotting two currents together, we have
Using $\delta \rho_{0}=\frac{\partial \rho_{0}}{\partial g_{\alpha \beta}} \delta g_{\alpha \beta}$, we extract the result we were after
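The extracted result, not displayed in this extract, is the familiar perfect-fluid form (a standard result, stated here for completeness):

```latex
T^{\mu\nu} = (\rho + p)\, v^{\mu} v^{\nu} + p\, g^{\mu\nu},
```

with $\rho=\rho_{0}(1+u)$ and $p=\rho_{0}^{2}\, \mathrm{d} u / \mathrm{d} \rho_{0}$ as above.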
This completes our review of the Lagrangian formulation. We shall use it in the next two chapters, first in thinking about symmetry breaking in cosmology (Chapter 41) and then in meeting one of the most successful field theories: electromagnetism (Chapter 42).
Chapter summary
Equations of motion for matter fields can be found using
(40.1) Consider the Lagrangian for a particle of unit mass: $L\left(x^{\mu}, \dot{x}^{\mu}\right)=\frac{1}{2} g_{\mu \nu} \dot{x}^{\mu} \dot{x}^{\nu}$, where dots denote a derivative with respect to proper time $\tau$.
(a) Confirm that the equation of motion for the particle is the geodesic law.
Recall from Chapter 2 that we define the canonical momentum as
The Hamiltonian allows access to another route to finding the equations of motion in terms of $p_{\mu}$ and $x^{\mu}$, via a pair of first-order differential equations
These are Hamilton's equations.
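As a worked hint (our sketch, not the book's solution): with $p_{\mu}=g_{\mu \nu} \dot{x}^{\nu}$, the Legendre transform of the geodesic Lagrangian gives

```latex
H = p_{\mu}\dot x^{\mu} - L = \tfrac{1}{2} g^{\mu\nu} p_{\mu} p_{\nu},
\qquad
\dot x^{\mu} = \frac{\partial H}{\partial p_{\mu}} = g^{\mu\nu} p_{\nu},
\qquad
\dot p_{\mu} = -\frac{\partial H}{\partial x^{\mu}}
             = -\tfrac{1}{2}\left(\partial_{\mu} g^{\alpha\beta}\right) p_{\alpha} p_{\beta} .
```

In particular, if the metric components are independent of some coordinate $x^{\mu}$, then $\dot{p}_{\mu}=0$ and the conjugate momentum is a constant of the motion, which is the point of part (d).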
(c) Use Hamilton's equations to compute the geodesic equation again.
(d) What happens to the equation of motion if a metric coefficient is independent of one of the coordinates?
(40.2) Assuming a Lagrangian of the same form as used in the last question, a particle in a Schwarzschild geometry has
(a) Compute the momenta and show that the Hamiltonian $H=L$.
(b) Use Hamilton's equations to show that $p_{t}$ and $p_{\phi}$ are constants of the motion.
(40.3) Consider the Lagrangian density
Show that $\Pi^{\mu}=\partial^{\mu} \phi$ and $\Pi^{0}=\pi=\dot{\phi}$.
(40.4) The symmetry of $\boldsymbol{T}$ is necessary for angular momentum to be conserved. We define an antisymmetric four-dimensional angular-momentum tensor for a particle as $J^{i k}=x^{i} p^{k}-x^{k} p^{i}$. In flat space and Cartesian coordinates, the continuous version is then
We then define a tensor with components $M^{\mu \nu \sigma}=x^{\mu} T^{\nu \sigma}-x^{\nu} T^{\mu \sigma}$. Conservation of angular momentum in flat $(3+1)$-dimensional spacetime is then expressed via
Show that for this to be true, T\boldsymbol{T} must be symmetric.
(40.5) Show that eqn 40.88 is consistent with local mass conservation, as described in eqn 40.86.
Inflation
Before she had drunk half the bottle, she found her head pressing against the ceiling, and had to stoop to save her neck from being broken. She hastily put down the bottle, remarking, 'That's quite enough - I hope I sha'n't grow any more.' Lewis Carroll (1832-1898) Alice's Adventures in Wonderland
Our story so far is that the Universe can be described, to a good approximation, as a Robertson-Walker spacetime that started in an initial Big-Bang singularity. However, there are some problems with an explanation of the current state of our Universe using this standard model of cosmology. We mentioned in Chapter 18 that observations suggest the Universe appears very close to being a spatially flat, or k=0k=0, spacetime. This can be expressed using the form of the Friedmann equation that reads
The implication is either that $k=0$ for all time, or that the current density ratio $\Omega=\rho / \rho_{\mathrm{c}} \approx 1$. Our observations provide an estimate of $\Omega \approx 0.3$, which isn't unity, but isn't massively far away either. In the case that $k=0$, we might ask how and why the Universe was set up to be flat. If $\Omega \approx 1$ now, but was not so in the past, this would seem quite a coincidence, given the need for the Universe to be very finely tuned in the past to conspire to give this result only now.${ }^{1}$ So what is the reason that we find $\Omega \approx 1$? This collection of questions is known as the flatness problem.
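To see why fine tuning is required (a standard estimate, our addition, in units with $c=1$): the Friedmann equation can be written as $\Omega-1=k / \dot{a}^{2}$, and in a radiation-dominated era $a \propto t^{1 / 2}$, so

```latex
\Omega - 1 = \frac{k}{\dot a^{2}} \propto \frac{k}{t^{-1}} \propto t .
```

Any departure from flatness therefore grows with time, so for $\Omega$ to be of order unity today, $|\Omega-1|$ must have been extraordinarily small at early times.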
A further problem is that the Universe is also observed to be surprisingly homogeneous at earlier times. ^(2){ }^{2} This is especially surprising in the context of a hot Big Bang model in which the matter in our Universe cools very rapidly as it expands.
Example 41.1
A rough picture of the earliest stages of the Universe can be summarized as follows. Evolution occurs via several phase transitions, the phenomena that form the basis of this chapter.
At a time $t \approx 10^{-43} \mathrm{~s}$ after the Big Bang, the temperature scale of the Universe is $T \approx 10^{32} \mathrm{~K}$ and the corresponding energy scale is $E \approx 10^{19} \mathrm{GeV}$. Under these conditions, the strong and electroweak interactions are unified into a single interaction. On cooling below this scale (at time $t \approx 10^{-36} \mathrm{~s}$, energy $10^{15} \mathrm{GeV}$ or temperature $10^{27} \mathrm{~K}$) a phase transition${ }^{3}$ takes place, causing the strong and electroweak interactions to separate.
41.1 Symmetry breaking · 41.2 Effective potentials · 41.3 Why flat? · Chapter summary · Exercises
${ }^{1}$ The implications of the apparently finely tuned nature of the Universe have generated a lot of debate. For more on this issue, see the book A Fortunate Universe by Lewis and Barnes (listed in the Further Reading in Appendix A) or the article on this subject in the Stanford Encyclopedia of Philosophy (https://plato.stanford.edu/). ${ }^{2}$ The Cosmic Background Explorer (COBE) satellite showed that the cosmic microwave background (CMB) has a near-perfect black-body spectrum with average temperature $T=2.73 \mathrm{~K}$, with only faint anisotropies at a level of 1 part in $10^{5}$. ${ }^{3}$ This is known as the 'GUT phase transition', referencing putative Grand Unified (GUT) theories, which describe the combination of the strong and electroweak interactions. ${ }^{4}$ By 'not causally connected' we mean that photons from one region have not had time to travel to another region, much less establish thermal equilibrium.
Fig. 41.1 (a) Penrose diagram showing the horizon problem. The dotted line represents a phase transition (P.T.) and B.B. labels the Big-Bang singularity. Events $q$ and $r$ do not have any shared history, and neither do events $s$ and $t$ that take place at the instant of the phase transition. (b) Inflation pushes back the Big Bang, so that the pasts of events $q$ and $r$ overlap for a period after the Big Bang. Events $s$ and $t$ that take place at the phase transition then also have a shared history. ${ }^{5}$ Inflation was originally suggested by Alan Guth (1947- ), who examined the case of a scalar field in a spatially flat Universe, as we do in this chapter. ${ }^{6}$ See Penrose (2004) for a discussion. ${ }^{7}$ L. D. Landau (1908-1968), perhaps best known for foundational work in condensed matter physics, also made contributions to general relativity, including the independent computation of the Chandrasekhar limit (the maximum mass of a stable white dwarf) in 1932.
Fig. 41.2 The magnet at $T>T_{\mathrm{c}}$. The magnetization, or average moment, is zero in (a) and, after rotation of each spin through $180^{\circ}$, is still zero in (b).
At $t \approx 10^{-11} \mathrm{~s}$ (or $T \approx 10^{15} \mathrm{~K}$, $E \approx 10^{2} \mathrm{GeV}$) another phase transition occurs, causing the Higgs scalar field to take on a finite value. The result is that the electromagnetic and weak interactions separate. The Higgs mechanism causes some of the leptons to become massive. This is known as the electroweak symmetry-breaking transition.
After $t \approx 10^{-5} \mathrm{~s}$ the Universe is cool enough ($T \approx 10^{12} \mathrm{~K}$, $E \approx 100 \mathrm{MeV}$) for a further phase transition to occur. This one allows the strong interaction to increase in strength, leading to quark confinement.
The expansion of the early Universe implies that regions of space that cannot have been causally connected${ }^{4}$ in the early stages of the cooling and expansion of the Universe are now found to be at almost exactly the same temperature. This is known as the horizon problem. Figure 41.1(a) shows a Penrose diagram illustrating the problem. An observer at $p$ can see two events ($q$ and $r$) by looking in different directions. These events are not causally connected, in that they don't have histories that overlap at any period after the Big Bang. It is therefore a problem if the regions of spacetime around the events are near-identical.
To explain these problems we might suggest that the Universe underwent a period of very rapid expansion in its early history, known as inflation.${ }^{5}$ This involves an acceleration of the scale factor $a(t)$ of the Universe. [This is in contrast to most of the results we have examined, in which $a(t)$ is decelerating.] Inflation allows causally connected regions of spacetime to separate rapidly enough to give the homogeneous temperature distribution that is observed. As illustrated in Fig. 41.1(b), it effectively pushes back the Big Bang, allowing events taking place at a phase transition to have some region of shared history. It is notable that this does not necessarily solve the horizon problem: although there is some communication in the past, events such as $s$ and $t$ in Fig. 41.1(b) do not share much of their history, so might well be rather different at the point of the phase transition.${ }^{6}$ Inflation also results in a spatially flat Universe.
Inflation can be understood using the machinery of field theory that we discussed in the last chapter. The key concept is one familiar from thermodynamics that has been invoked in the evolution of the early Universe: the symmetry-breaking phase transition.
41.1 Symmetry breaking
An intuitive and powerful description of a continuous phase transition was developed by Lev Landau. ${ }^{7}$ Our discussion of Landau's theory of phase transitions begins with a simple observation about magnets. We imagine a magnet whose spin moments can point either up or down. At high temperatures each spin is equally likely to be found up or down. This system is shown in Fig. 41.2(a). Its magnetization, that is, the spatial average of the magnetic moment, is zero. The system has a symmetry: turn all of the spins through $180^{\circ}$ [as in Fig. 41.2(b)] and the magnetization is still zero. Of course, each individual moment is pointing in the opposite direction, but the number pointing in the upwards direction is still half of the total and the magnetization is still zero. The symmetry here is a global one: we rotate all of the spins through the same angle, here $180^{\circ}$.
It is found experimentally that upon cooling the system through a critical temperature T_(c)T_{c}, the system undergoes a phase transition and the magnetization MM becomes non-zero as all of the spins line up along a single direction, as shown in Fig. 41.3(a). The direction along which the spins align could either be all in the up direction or the down direction. If we rotate each spin of the aligned system through 180^(@)180^{\circ} [Fig. 41.3(b)] MM is obviously reversed. We say that the system has broken (or lowered) its symmetry in the ordered phase.
One puzzling feature of this description is the reason why the system chose to point all of the spins in one direction rather than the other. After all, there's nothing in the Hamiltonian which describes the system that distinguishes between up and down. ^(8){ }^{8} The result is that the ground state does not have the symmetry of the Hamiltonian describing the system. The original symmetry of the system appears to have spontaneously broken. The same thing happens in the Euler strut shown in Fig. 41.4. A weight is balanced on top of an elastic strut. If the weight is large enough the strut will buckle. The buckling can be either to the left or to the right. There is nothing in the underlying physics of the strut and weight that allows one to predict which way it will go. ^(9){ }^{9}
Since equilibrium in thermodynamic systems relies on both minimizing the internal energy $U$ and maximizing the entropy $S$ of a system, Landau considered the free energy $F=U-TS$. To find the equilibrium state of the system we need to minimize $F$. The free energy is a function of an order parameter, namely some field describing the system whose thermal average is zero in the $T>T_{\mathrm{c}}$ unbroken-symmetry state and non-zero in the $T<T_{\mathrm{c}}$ broken-symmetry state. For a magnet, the order parameter is simply the magnetization field $M$. With deliberate ignorance of the microscopic state of the system, Landau wrote $F(M)$ as a power series
$$
F(M)=F_{0}+a M^{2}+b M^{4},
$$
where aa and bb (the latter assumed > 0>0 ) are parameters which are independent of MM, but may in principle depend on the temperature. This free energy has the symmetry M rarr-MM \rightarrow-M, that is, reversing the magnetization doesn't affect the energy. If aa and bb are positive we have the free energy shown in Fig. 41.5(a), which is minimized at M=0M=0, which is the correct prediction for the high-temperature regime. If, however, the parameter aa is negative then we have the free energy shown in Fig. 41.5(b), which has two minima at non-zero values of magnetization. These minima correspond to the spins all aligning up (M=+M_(0))\left(M=+M_{0}\right) or all aligning down (M=-M_(0))\left(M=-M_{0}\right). The previous minimum M=0M=0 is now at a position of metastable equilibrium. ^(10){ }^{10} If we take a=a_(0)(T-T_(c))a=a_{0}\left(T-T_{c}\right) (with a_(0)a_{0} a constant) and b( > 0)b(>0) to be TT-independent, then clearly aa is positive for T > T_(c)T>T_{c} and negative for T < T_(c)T<T_{c}, and so we predict a phase transition at a temperature T=T_(c)T=T_{\mathrm{c}}.
Fig. 41.3 For temperatures $T<T_{\mathrm{c}}$ the spins align along a single direction. The magnetization in (a) is different in (b), where each spin has been rotated by $180^{\circ}$. ${ }^{8}$ For the magnet this Hamiltonian could be $\hat{H}=-J \sum_{i} \hat{S}_{i}^{z} \hat{S}_{i+1}^{z}$, where $J$ is a coupling constant, $\hat{S}_{i}^{z}$ is the spin operator for the $i$th spin and the up direction is along $z$. This Hamiltonian contains no term either favouring the spins pointing up or favouring them pointing down, only a term favouring them pointing in the same direction.
Fig. 41.4 The Euler strut. (a) This has a vertical axis of symmetry and can buckle either (b) one way or (c) the other, in each case breaking the symmetry. ${ }^{9}$ Of course, in real life, some very small perturbation or fluctuation will have tipped the system towards its choice of buckling direction.
Fig. 41.5 The Landau free energy for (a) T > T_(c)T>T_{\mathrm{c}} with a minimum at M=M= 0 and (b) for T < T_(c)T<T_{\mathrm{c}} with minima at +-M_(0)\pm M_{0}. ^(10){ }^{10} This means that any slight perturbation of the system which has been prepared in a M=0M=0 state will drop it into one, or other, of the new minima that have emerged at +-M_(0)\pm M_{0}.
Example 41.2
We can find the minima of FF straightforwardly for T < T_(c)T<T_{\mathrm{c}}. We set
\begin{equation*}
\frac{\partial F}{\partial M}=2 a M+4 b M^{3}=0 \tag{41.3}
\end{equation*}
from which we conclude that the minima occur at $M=M_{0}$ where
$$
M_{0}=\left(-\frac{a}{2 b}\right)^{1 / 2} .
$$
The square root gives a real M_(0)M_{0} since a < 0a<0. There is also a maximum at M=0M=0.
For T < T_(c)T<T_{\mathrm{c}} we have two minima at M=+-M_(0)M= \pm M_{0}. The system will be propelled into one of these as its new ground state and the symmetry will be broken.
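The algebra of Example 41.2 is easy to verify numerically. The following Python sketch (my own illustration, not from the text; the values of $a$ and $b$ are arbitrary and $F_0$ is set to zero) scans the Landau free energy $F=F_0+aM^2+bM^4$ on a grid and confirms that for $a<0$ the minimum sits at $\pm M_0=\pm(-a/2b)^{1/2}$:

```python
import math

def F(M, a, b, F0=0.0):
    """Landau free energy F(M) = F0 + a*M**2 + b*M**4."""
    return F0 + a * M**2 + b * M**4

a, b = -1.0, 0.5                 # a < 0: the broken-symmetry regime
M0 = math.sqrt(-a / (2 * b))     # predicted minimum, M0 = sqrt(-a/2b)

# brute-force scan of F on a fine grid of M values
Ms = [i * 1e-4 for i in range(-30000, 30001)]
M_min = min(Ms, key=lambda M: F(M, a, b))

print(M0, abs(M_min))            # the scan lands on one of +/- M0
```

The scan cannot distinguish $+M_0$ from $-M_0$: which one `min` returns depends only on tie-breaking, mirroring the spontaneous choice made by the physical system.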
Let's now apply the arguments developed for magnets to our Lagrangian treatment of field theory. Here instead of following the ground state magnetization, we're interested in the equilibrium state value of phi(x)\phi(x), which is the order parameter for the field theory. We start with a scalar field theory
where we've split off all of the potential-energy-like terms and called them the potential energy density $U(\phi)$. For the so-called $\phi^{4}$ theory we have that
$$
U(\phi)=\frac{\mu^{2}}{2} \phi^{2}+\frac{\lambda}{4!} \phi^{4},
$$
where mu\mu and lambda( > 0)\lambda(>0) are parameters. This function resembles the free energy in our magnet example above. It admits the global symmetry phi(x)rarr-phi(x)\phi(x) \rightarrow-\phi(x). Assuming mu^(2)\mu^{2} is positive we have a potential with a minimum at phi=0\phi=0 as shown in Fig. 41.6(a). The minimum corresponds to a ground state, also known as the vacuum, of phi(x)=0\phi(x)=0.
What if we swap the sign in front of $\mu^{2}$? In that case we have $U(\phi)=-\frac{\mu^{2}}{2} \phi^{2}+\frac{\lambda}{4!} \phi^{4}$, and a potential that looks like Fig. 41.6(b). The minima ${ }^{11}$ of the potential now occur at
$$
\phi_{0}= \pm\left(\frac{6 \mu^{2}}{\lambda}\right)^{1 / 2},
$$
so we have the choice of two new vacua. Once the temperature has been lowered such that we have the characteristic broken-symmetry potential, we imagine the system lowering its energy by adopting one of the states phi_(0)\phi_{0} consistent with one of the minima. This can be thought of as the system rolling down the potential towards a minimum of the potential function U(phi_(0))=U_(0)U\left(\phi_{0}\right)=U_{0}.
This mechanism, involving a scalar field that breaks a symmetry, is the one responsible for the Higgs mechanism that leads to the generation of mass in the electroweak part of the Standard Model. Since there is good evidence for the existence of a scalar Higgs field, there is certainly motivation to examine the effect of scalar fields on the expansion of the Universe. This combination of a field and an expanding spacetime will provide a model of inflation.
41.2 Effective potentials
Our model for inflation involves the phase transition of a scalar field similar to that discussed above. ^(12){ }^{12} Let's consider a scalar field phi(x)\phi(x) in the Universe before the GUT phase transition. We'll use the phi^(4)\phi^{4} Lagrangian
so that the potential $U=\frac{\mu^{2}(T)}{2} \phi^{2}+\frac{\lambda}{4!} \phi^{4}$. This field has an equation of motion
\begin{equation*}
\ddot{\phi}-\vec{\nabla}^{2} \phi=-\mu^{2}(T) \phi-\frac{\lambda}{3!} \phi^{3} . \tag{41.8}
\end{equation*}
In eqn 41.8, T(t)T(t) is the time-dependent temperature which causes mu^(2)(T)\mu^{2}(T) to change sign as a function of time below a critical temperature T_(c)T_{\mathrm{c}}. At this point the field phi\phi undergoes a phase transition. In the cosmology we're dealing with, we shall assume that the field phi\phi is always uniform in space, so that vec(grad)phi=0\vec{\nabla} \phi=0 at all times. Crucially, phi\phi will vary as a function of time as the phase transition takes place.
We shall now show how the expansion of a Robertson-Walker spacetime causes a frictional term in the equation of motion for phi\phi, that depends on the rate of expansion of spacetime. The energy-density and pressure of the phi\phi field must also have an effect on the evolution of the Universe, determining the rate of expansion. Taken together, the resulting coupled equations of motion can lead to the rapid expansion of the Universe we're looking for. In the example below, we shall simplify the description for that appropriate to a flat Universe. ^(13){ }^{13}
in a spatially flat Robertson-Walker spacetime ^(14){ }^{14} with diagonal metric g^(tt)=-1g^{t t}=-1 and g^(ii)=1//a(t)^(2)g^{i i}=1 / a(t)^{2}. The equations of motion are
${ }^{12}$ This field should not be identified with the Higgs field, but is rather some other field in the Universe. Confusingly, it is sometimes called a Higgs field. ${ }^{14}$ This metric has $\Gamma^{0}{ }_{i j}=a \dot{a} \delta_{i j}$ and $\Gamma_{j 0}^{i}=(\dot{a} / a) \delta_{j}^{i}$.
The second term of the latter equation supplies an effective drag factor $3 H=3 \dot{a} / a$ that damps the motion of the $\phi$ field. Our theory has an energy-momentum tensor given by
In the event that the fields are constant in time, we have $\rho=-p$, which, comparing to the case of a fluid ${ }^{15}$ gives us an effective cosmological constant $\Lambda=8 \pi U(\phi)$. This is promising, since it implies that the non-zero scalar field can potentially cause the sort of rapid expansion we saw in the de Sitter spacetime of Universe 1 in Chapter 15. We also need to find the equation of motion for $a(t)$. The Friedmann equation for $k=0$ tells us that
$$
\left(\frac{\dot{a}}{a}\right)^{2}=\frac{8 \pi \rho}{3} .
$$
This completes the necessary ingredients to describe inflation.
The two most important equations from our analysis of the scalar field in an expanding universe are then: (i) the damped equation of motion for the $\phi$ field
\begin{equation*}
\ddot{\phi}+3 \frac{\dot{a}}{a} \dot{\phi}=-\frac{\partial U}{\partial \phi}, \tag{41.21}
\end{equation*}
and (ii) the Friedmann equation for the expansion factor
\begin{equation*}
\left(\frac{\dot{a}}{a}\right)^{2}=\frac{8 \pi}{3}\left(\frac{1}{2} \dot{\phi}^{2}+U(\phi)\right) . \tag{41.22}
\end{equation*}
The key now is to look at the phase transition in the phi\phi field and how phi\phi makes its way to its new minimum as a function of time for T < T_(c)T<T_{\mathrm{c}}. To make the description simple we make the slow-rolling approximation, that at temperatures below the transition, the phi\phi field will change slowly in time. This requires a shallow potential U(phi)U(\phi), as shown in Fig. 41.7.
Slow rolling means that we have a negligible value of $\ddot{\phi}$ so that the equation of motion for the $\phi$ field (eqn 41.21) becomes $3(\dot{a} / a) \dot{\phi}=-\partial U / \partial \phi$. Near the transition the flat potential takes a near-constant value $U(\phi) \approx U\left(\phi_{0}\right)=U_{0}$, which implies that $\dot{\phi}$ also becomes small. What we're really interested in is the equation of motion for the expansion factor of the Universe $a(t)$, and so, considering eqn 41.22 with small $\dot{\phi}$, we deduce that
$$
\left(\frac{\dot{a}}{a}\right)^{2} \approx \frac{8 \pi U_{0}}{3}, \quad \text { and hence } \quad a(t) \propto \mathrm{e}^{H t}, \quad H=\left(\frac{8 \pi U_{0}}{3}\right)^{1 / 2} .
$$
The theory therefore predicts an inflationary period where, owing to the behaviour of the phi\phi field close to its phase transition and the consequences on its energy density, the expansion factor a(t)a(t) of the Universe increases exponentially.
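A toy numerical integration makes the slow-rolling behaviour concrete. Everything below is my own illustration: the units are arbitrary and the double-well potential $U(\phi)=U_0(1-(\phi/\phi_0)^2)^2$ is an assumed stand-in for the shallow potential of Fig. 41.7. Integrating the damped equation of motion together with the $k=0$ Friedmann equation shows $\ln a(t)$ growing at very nearly the slow-roll rate $H=(8\pi U_0/3)^{1/2}$ while $\phi$ rolls slowly off the plateau:

```python
import math

# toy parameters (arbitrary units): plateau height U0, well position phi0
U0, phi0 = 1.0, 1.0

def U(phi):
    # assumed shallow double-well potential, standing in for Fig. 41.7
    return U0 * (1 - (phi / phi0) ** 2) ** 2

def dU(phi):
    return -4 * U0 * (phi / phi0**2) * (1 - (phi / phi0) ** 2)

# start near the top of the plateau, at rest
phi, phidot, lna, dt = 0.01, 0.0, 0.0, 1e-4
lna_prev = 0.0
for _ in range(50000):                        # integrate to t = 5
    rho = 0.5 * phidot**2 + U(phi)            # scalar-field energy density
    H = math.sqrt(8 * math.pi * rho / 3)      # Friedmann equation, k = 0
    phiddot = -3 * H * phidot - dU(phi)       # damped equation of motion
    phi += phidot * dt
    phidot += phiddot * dt
    lna_prev = lna
    lna += H * dt                             # d(ln a)/dt = H

H_slowroll = math.sqrt(8 * math.pi * U0 / 3)  # slow-roll prediction
H_measured = (lna - lna_prev) / dt
print(H_slowroll, H_measured, lna)            # H agrees to ~1%; ln a is large
```

By $t=5$ the Universe has inflated by a factor $\mathrm{e}^{\ln a}\sim\mathrm{e}^{14}$, and the measured expansion rate tracks the slow-roll value because the kinetic energy $\frac{1}{2}\dot{\phi}^2$ stays tiny compared with $U_0$.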
41.3 Why flat?
In the above discussion, we have assumed the starting point of inflation was a k=0k=0 universe. ^(16){ }^{16} In this section, we shall see that the result of inflation is a flat Universe and, in fact, the end result of an inflationary expansion is always a flat space, no matter which curvature we start with. To understand this we first note that rapid expansion is always seen in models with a cosmological constant Lambda\Lambda and we can therefore compare our exponential expansion to these other analogous universes with cosmological-constant-driven expansion. These models predict expansions
{:(41.25)a(t)prop{[cosh Ht,(k=+1)","],[e^(Ht),(k=0)","],[sinh Ht,(k=-1)","]:}:}a(t) \propto \begin{cases}\cosh H t & (k=+1), \tag{41.25}\\ \mathrm{e}^{H t} & (k=0), \\ \sinh H t & (k=-1),\end{cases}
with $H^{2}=\Lambda / 3$. Owing to the behaviour of the hyperbolic functions, each of these cases evolves towards an exponential expansion. That is, they are eventually all consistent with the $\Omega=1$ case that results in $k=0$ flat space. So, even if $\Omega$ is not initially close to 1, exponential expansion forces $\Omega$ towards this value very rapidly.
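The convergence towards $\Omega=1$ can be checked directly. With the conventional normalizations $a=H^{-1}\cosh Ht$ for $k=+1$ and $a=H^{-1}\sinh Ht$ for $k=-1$ (an assumption about constants the text leaves implicit), the curvature measure $\Omega-1=k/\dot{a}^{2}$ decays exponentially:

```python
import math

H = 1.0

def omega_minus_one(k, t):
    """Omega - 1 = k / (da/dt)^2 for the Lambda-driven solutions
    a = cosh(Ht)/H (k = +1) and a = sinh(Ht)/H (k = -1)."""
    adot = math.sinh(H * t) if k == +1 else math.cosh(H * t)
    return k / adot**2

for t in (1.0, 5.0, 10.0):
    print(t, omega_minus_one(+1, t), omega_minus_one(-1, t))
```

After ten e-folds, $|\Omega-1|$ has dropped below $10^{-8}$ for either sign of $k$: the memory of the initial curvature is erased.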
The Friedmann equations tell us that in the absence of other sources of cosmological constant Lambda\Lambda, the acceleration of the Universe obeys
so in order to achieve the positive acceleration needed for inflation we require $p<-\rho / 3$. In terms of the estimates for $p$ and $\rho$ from our scalar field (eqns 41.16 and 41.17), this condition is equivalent to
$$
\dot{\phi}^{2}<U(\phi) .
$$
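Assuming the standard homogeneous scalar-field expressions $\rho=\frac{1}{2}\dot{\phi}^2+U$ and $p=\frac{1}{2}\dot{\phi}^2-U$ (the forms the text refers to as eqns 41.16 and 41.17, which do not survive in this copy), the equivalence of $p<-\rho/3$ and $\dot{\phi}^2<U(\phi)$ is pure algebra, and a brute-force check confirms it:

```python
import random

random.seed(0)

def rho_and_p(phidot, U):
    # homogeneous scalar field: assumed standard forms of eqns 41.16/41.17
    rho = 0.5 * phidot**2 + U
    p = 0.5 * phidot**2 - U
    return rho, p

ok = True
for _ in range(10000):
    phidot = random.uniform(-3.0, 3.0)
    U = random.uniform(0.0, 5.0)
    rho, p = rho_and_p(phidot, U)
    # inflation condition: p < -rho/3  <=>  phidot**2 < U
    ok = ok and ((p < -rho / 3) == (phidot**2 < U))

print(ok)  # True
```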
The condition for expansion provides a means for inflation to cease during the rolling of the field $\phi$ along the potential: that is, when the speed becomes large enough and/or the value of $U(\phi)$ is sufficiently small. One possibility is that $\phi$ reaches the minimum at $U_{0}$ and then oscillates about this value as the damping term reduces the kinetic energy $\frac{1}{2} \dot{\phi}^{2}$ until the field is at rest. If $U_{0}<0$ at the minimum then inflation stops. ${ }^{17}$
${ }^{16}$ Inflation occurs at the start of the expansion of the Universe, so the effect of a particular value of $k$ will be difficult to detect before the Universe has evolved sufficiently. Other values of $k$ can be treated in the same spirit, but the differences in the predictions are small compared to the observables. Treating $k=0$ is therefore a good approximation.
${ }^{17}$ The condition therefore also provides a means for inflation to continue indefinitely if the field comes to rest at the minimum and $U_{0}>0$.
In practice, the presence of coupling to other fields will cause reheating, where
the scalar field decays during the roll, producing other particle excitations. At the end of the period of inflation we are then left with a flat Universe filled with normal matter and radiation, allowing a standardmodel expansion of the Universe afterwards.
Although the inflationary model has been hugely influential over the last few decades, it is not accepted by absolutely all cosmologists. Is it the correct model of the early Universe? It is probably fair to say that the jury is still out. However, cosmic inflation remains the reigning paradigm. As we've seen in this chapter, it solves some knotty problems in cosmology and it is likely to continue to remain our best picture of the Universe just after the Big Bang, unless of course some other revolutionary new idea or piece of contradictory observational evidence turns up. We discuss the current status of the field in Chapter 49.
Chapter summary
To solve the flatness and horizon problems of cosmology, inflation theory proposes a rapid period of expansion in the early life of the Universe.
The theory relies on the notion of a symmetry-breaking phase transition that takes place as the Universe cools.
The slow-rolling approximation predicts an exponential expansion of the Universe for a limited period of time, leading to a spatially flat Universe at the end of the expansion.
Exercises
(41.1) (a) Compute the equation of motion for a scalar field Phi\Phi described by the Lagrangian in eqn 41.5 in a Robertson-Walker universe with k!=0k \neq 0.
Use the connection coefficients given in Chapter 16. (b) Verify that the equation of motion reduces to the expected form for $k=0$.
The electromagnetic field
Electrical force is defined as something which causes motion of electrical charge; an electrical charge is something which exerts electric force.
Arthur Eddington (1882-1944) The Nature of the Physical World
O'er the wires the electric message came,
'He is no better; he is much the same.'
Alfred Austin (1835-1913) (attrib.) On the Illness of the Prince of Wales
In this chapter, we use the geometrical techniques from the previous part of the book to reformulate electromagnetism. In some ways, electromagnetism can be viewed as one of the simplest and most successful field theories in Nature. Einstein certainly thought so, and would return to electromagnetism regularly for inspiration throughout his formulation of relativity. In much the same way that in Chapter 0 we characterized gravitation with John Wheeler's slogan
spacetime tells matter how to move; matter tells spacetime how to curve,
we can similarly sloganize electromagnetism by saying
electromagnetic fields tell electric charges how to move; electric charges tell electromagnetic fields where to go.
We shall return to the analogy between electromagnetism and gravitation repeatedly in the coming chapters. This chapter aims to set the scene. We start by working in flat, Minkowski spacetime. ^(1){ }^{1} We shall produce a Lagrangian description of electromagnetism in terms of the components of its field. We then formulate a description of electromagnetism using the coordinate-free, geometrical machinery of tensors.
42.1 Electric charge in a field
A charge can be thought of as that property of a particle that couples to a field. How could a tensor field cause a charge to move? We have seen the answer for the case of the metric field where curvature causes the movement of massive particles. However, there are other ways that
42.1 Electric charge in a field
42.2 Faraday tensor and Maxwell equations
42.3 Gauge freedom
42.4 Geometrical electromagnetism
Chapter summary
Exercises
${ }^{2}$ Recall that the Lagrangian description of fields relies on the components of tensors. We return to our coordinate-free, geometrical description below.
a field can cause a change in motion of a charge. Let's ask what the simplest method of coupling a particle and a field might be. This, it turns out, is how electromagnetism manifests itself in Nature.
The relativistic action for a free massive particle is given by
where in flat, Minkowski space the interval ds^(2)=eta_(mu nu)dx^(mu)dx^(nu)\mathrm{d} s^{2}=\eta_{\mu \nu} \mathrm{d} x^{\mu} \mathrm{d} x^{\nu}. We would like this particle to interact with another field. Perhaps the simplest coupling that obeys Lorentz symmetry couples the displacement of the particle dx^(mu)\mathrm{d} x^{\mu} to the components of a 1 -form field tilde(A)(x)\tilde{\boldsymbol{A}}(x). This results in a contribution to the action of ^(2){ }^{2}
where the charge qq tells us the strength of the coupling between the field and the displacement. This action describes the interaction between the electromagnetic field ^(3) tilde(A)(x){ }^{3} \tilde{\boldsymbol{A}}(x) and a particle with electric charge qq. The particle is assumed massive and so we choose to parametrize its path with an affine parameter tau\tau. We then have a total action SS for a particle in the field tilde(A)(x)\tilde{\boldsymbol{A}}(x) given by the sum of the free-particle action and the interaction of the particle and the electromagnetic field ^(4)S=S_(m)+S_(em){ }^{4} S=S_{\mathrm{m}}+S_{\mathrm{em}} or
This action can be used to derive the equations of motion for the particle.
Example 42.1
If the action is given by S=int LdtauS=\int L \mathrm{~d} \tau, then we can write the Lagrangian as L=L=L_(m)+L_(em)L_{\mathrm{m}}+L_{\mathrm{em}}. We then utilize the Euler-Lagrange equations for each part in turn. The kinetic energy contribution of the free particle is
where $u^{\beta}=\mathrm{d} x^{\beta} / \mathrm{d} \tau$ are the components of the velocity, and the interaction is encoded in the tensor components
$$
F_{\alpha \beta}=A_{\beta, \alpha}-A_{\alpha, \beta} .
$$
In fact, the $F_{\alpha \beta}$ are the components of the Faraday tensor. ${ }^{6}$ This is a $(0,2)$ tensor which can be written in a basis as
$$
\tilde{\boldsymbol{F}}=F_{\alpha \beta} \boldsymbol{d} x^{\alpha} \otimes \boldsymbol{d} x^{\beta} .
$$
We shall see that it is this tensor which gives us access to the electromagnetic fields. We can summarize the equation of motion symbolically as ${ }^{8}$
$$
\frac{\mathrm{d} \tilde{\boldsymbol{p}}}{\mathrm{d} \tau}=q \tilde{\boldsymbol{F}}(\cdot, \boldsymbol{u}),
$$
where the momentum 1 -form features on the left-hand side and the velocity u\boldsymbol{u} has been inserted into the second slot of the tensor tilde(F)\tilde{\boldsymbol{F}} (corresponding to contracting against the second index of F_(alpha beta)F_{\alpha \beta} in eqn 42.17). Since this expression is a valid tensor expression in flat space, it must also be applicable to curved space and represents the answer to our question of how a charge couples to the electromagnetic field. Next, we turn to the dynamics of the 1 -form field tilde(A)\tilde{\boldsymbol{A}} and the 2 -form tilde(F)\tilde{\boldsymbol{F}}, as these provide the equations of motion of the electromagnetic fields, also known as Maxwell's equations. ^(9){ }^{9}
42.2 Faraday tensor and Maxwell equations
The field tilde(A)(x)=A_(mu)(x)dx^(mu)\tilde{\boldsymbol{A}}(x)=A_{\mu}(x) \boldsymbol{d} x^{\mu} is a 1-form field known as the electromagnetic gauge field. The four components A_(mu)=(A_(0),A_(i))A_{\mu}=\left(A_{0}, A_{i}\right) may, in flat ^(5){ }^{5} As examined in the Exercises, the spatial components of the right-hand side of this equation give the relativistic version of the Lorentz force law
where the antisymmetry of the components is used to produce the last line. ^(8){ }^{8} That is, we lower the index with x_(alpha)=x_{\alpha}=eta_(alpha mu)x^(mu)\eta_{\alpha \mu} x^{\mu} and then see that the left-hand side can be written as
where $u_{\alpha}=\mathrm{d} x_{\alpha} / \mathrm{d} \tau$, before noting that $\tilde{\boldsymbol{p}}=m \tilde{\boldsymbol{u}}$. Note also that since $\tilde{\boldsymbol{F}}(\boldsymbol{u}, \boldsymbol{u})=0$, which follows from the antisymmetry of the Faraday tensor, we have on the left-hand side an expression that effectively says $\boldsymbol{u} \cdot \boldsymbol{a}=0$, where $\boldsymbol{a}$ is the acceleration, which is a condition required for our theory (see Chapter 1). Finally, the upgrade to curved space simply swaps commas for semicolons to give
The equation of motion tells us that the departure from geodesic free fall is given by the combination of the charge, Faraday tensor and particle velocity. ${ }^{9}$ James Clerk Maxwell (1831-1879). ${ }^{10}$ As usual, $A^{i}$ are the components of the spatial 3-vector $\vec{A}$. ${ }^{11}$ Note that in flat space we can also transform these to the components of a $(2,0)$ tensor $\boldsymbol{F}=F^{\mu \nu} \boldsymbol{e}_{\mu} \otimes \boldsymbol{e}_{\nu}=\frac{1}{2} F^{\mu \nu} \boldsymbol{e}_{\mu} \wedge \boldsymbol{e}_{\nu}$, where $F^{\mu \nu}=\eta^{\mu \alpha} \eta^{\nu \beta} F_{\alpha \beta}$.
Notice how the components of the vec(E)\vec{E} field change their signs, but those of the vec(B)\vec{B} field do not. ^(12){ }^{12} It is therefore JJ that tells the field lines where to go (and the field lines tell J\boldsymbol{J} how to evolve in spacetime). ^(13){ }^{13} This is related to the Bianchi identity for gravitation that we have met in Chapter 13. The link will be discussed in detail in Chapter 43, but we shall show how the identity arises geometrically in electromagnetism at the end of this chapter ^(14){ }^{14} Set alpha=1,beta=2\alpha=1, \beta=2 and gamma=3\gamma=3. ^(15){ }^{15} Set one of the indices equal to zero and this follows.
space, be mapped with metric $\boldsymbol{\eta}$ onto a $(1,0)$ vector field $\boldsymbol{A}(x)$ with components $A^{\mu}=\left(A^{0}, A^{i}\right)$, where $A_{0}=-A^{0}$ and ${ }^{10} A_{i}=A^{i}$. The timelike component $A^{0}$ is often called the electrostatic potential $V$.
We then define the 3 -vector electric field vec(E)\vec{E} in terms of the components of the gauge field as
An important property of these equations is that there is some freedom in how the components A_(mu)A_{\mu} are specified. In fact, if we make the gauge transformation
where chi(x)\chi(x) is some arbitrary function, then the vec(E)\vec{E} and vec(B)\vec{B} fields are unchanged. This turns out to be of fundamental importance to the theory and is discussed further in the next section. Our task here is to derive the equations of motion of the electromagnetic field.
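The gauge invariance of $\vec{E}$ and $\vec{B}$ can be verified numerically. The sketch below assumes the familiar three-vector definitions $\vec{E}=-\vec{\nabla}V-\partial\vec{A}/\partial t$ and $\vec{B}=\vec{\nabla}\times\vec{A}$ with $V=A^{0}$ (the text's own defining equations do not survive in this copy, and its sign conventions may differ); the potentials and the gauge function $\chi$ are arbitrary smooth examples of my own:

```python
import math

h = 1e-4  # finite-difference step

# arbitrary smooth example potentials and gauge function chi (my own choices);
# coordinates are (t, x, y, z)
def V(t, x, y, z):  return math.sin(x) * math.cos(t)
def Ax(t, x, y, z): return 0.3 * y * z + math.cos(t + x)
def Ay(t, x, y, z): return 0.1 * x * x
def Az(t, x, y, z): return math.sin(y) * t
def chi(t, x, y, z): return math.sin(t * x) + 0.2 * y * z * t

def d(f, i, pt):
    """Central difference of f along coordinate i at the point pt."""
    p1, p2 = list(pt), list(pt)
    p1[i] += h
    p2[i] -= h
    return (f(*p1) - f(*p2)) / (2 * h)

def fields(V, Ax, Ay, Az, pt):
    """E = -grad V - dA/dt and B = curl A (assumed conventions)."""
    E = [-d(V, i, pt) - d(A, 0, pt) for i, A in ((1, Ax), (2, Ay), (3, Az))]
    B = [d(Az, 2, pt) - d(Ay, 3, pt),
         d(Ax, 3, pt) - d(Az, 1, pt),
         d(Ay, 1, pt) - d(Ax, 2, pt)]
    return E + B

# gauge-transformed potentials: V -> V + dchi/dt, A -> A - grad chi
def Vp(t, x, y, z):  return V(t, x, y, z) + d(chi, 0, (t, x, y, z))
def Axp(t, x, y, z): return Ax(t, x, y, z) - d(chi, 1, (t, x, y, z))
def Ayp(t, x, y, z): return Ay(t, x, y, z) - d(chi, 2, (t, x, y, z))
def Azp(t, x, y, z): return Az(t, x, y, z) - d(chi, 3, (t, x, y, z))

pt = (0.7, 0.3, -0.4, 1.1)
err = max(abs(a - b) for a, b in
          zip(fields(V, Ax, Ay, Az, pt), fields(Vp, Axp, Ayp, Azp, pt)))
print(err)  # limited only by finite-difference noise
```

The cancellation works because mixed partial derivatives commute: the $\vec{\nabla}\partial_t\chi$ term from $V$ cancels the $\partial_t\vec{\nabla}\chi$ term from $\vec{A}$, and $\vec{\nabla}\times\vec{\nabla}\chi=0$.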
Since the components of the Faraday tensor are given by F_(mu nu)=A_(nu,mu)-F_{\mu \nu}=A_{\nu, \mu}-A_(mu,nu)A_{\mu, \nu} then, in terms of the components of vec(E)\vec{E} and vec(B)\vec{B} fields, we can write the matrix as ^(11){ }^{11}
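The matrix itself has not survived in this copy, but in one common sign convention (which may differ from the book's) it can be written down and its advertised properties checked in a few lines of Python: antisymmetry, the observation quoted in the margin that raising both indices flips the $\vec{E}$ components but not the $\vec{B}$ components, and the Lorentz-force structure of $qF_{i\beta}u^{\beta}$:

```python
# one common sign convention for F_{mu nu} (the book's may differ):
def faraday(E, B):
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    return [[0.0, -Ex, -Ey, -Ez],
            [Ex, 0.0, Bz, -By],
            [Ey, -Bz, 0.0, Bx],
            [Ez, By, -Bx, 0.0]]

eta = [[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

def raise_both(F):
    """F^{mu nu} = eta^{mu a} eta^{nu b} F_{a b} (flat space)."""
    return [[sum(eta[m][a] * eta[n][b] * F[a][b]
                 for a in range(4) for b in range(4))
             for n in range(4)] for m in range(4)]

E, B = (1.0, 2.0, 3.0), (4.0, 5.0, 6.0)
F = faraday(E, B)
Fup = raise_both(F)

# antisymmetry, and the sign flip of E (row 0) but not B on raising indices
antisym = all(F[m][n] == -F[n][m] for m in range(4) for n in range(4))
e_flips = all(Fup[0][i] == -F[0][i] for i in range(4))
b_same = all(Fup[i][j] == F[i][j] for i in range(1, 4) for j in range(1, 4))

# Lorentz force: q F_{i beta} u^beta = q (E + v x B)_i in the low-speed limit
q, v = 2.0, (0.1, -0.2, 0.3)
u = (1.0,) + v
f = [q * sum(F[i][b] * u[b] for b in range(4)) for i in (1, 2, 3)]
vxB = (v[1] * B[2] - v[2] * B[1],
       v[2] * B[0] - v[0] * B[2],
       v[0] * B[1] - v[1] * B[0])
lorentz = [q * (E[i] + vxB[i]) for i in range(3)]
match = all(abs(a - b) < 1e-12 for a, b in zip(f, lorentz))
print(antisym, e_flips, b_same, match)  # True True True True
```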
The components of $\tilde{\boldsymbol{F}}$ have dynamics of their own, encoded in Maxwell's equations of motion. Recall that Maxwell's equations are written in natural units as
$$
\vec{\nabla} \cdot \vec{E}=\rho, \quad \vec{\nabla} \times \vec{B}-\frac{\partial \vec{E}}{\partial t}=\vec{J}, \quad \vec{\nabla} \cdot \vec{B}=0, \quad \vec{\nabla} \times \vec{E}+\frac{\partial \vec{B}}{\partial t}=0 .
$$
We combine the charge density rho\rho and the current density vec(J)\vec{J} to create the current 4-vector J\boldsymbol{J} with components J^(mu)=(rho, vec(J))J^{\mu}=(\rho, \vec{J}). This vector represents the source of the electromagnetic fields (i.e. charges and currents) and also the bodies that are set in motion by the fields. ^(12){ }^{12}
Maxwell's equations can be re-expressed in terms of the components of the Faraday tensor. It can be checked that the Bianchi identity ${ }^{13}$
$$
F_{\alpha \beta, \gamma}+F_{\beta \gamma, \alpha}+F_{\gamma \alpha, \beta}=0
$$
yields the Maxwell equations ${ }^{14}$ $\vec{\nabla} \cdot \vec{B}=0$ and ${ }^{15}$ $-\partial \vec{B} / \partial t=\vec{\nabla} \times \vec{E}$. The other two Maxwell equations (involving the components of the current $J$) can be recreated using the equation
$$
\partial_{\nu} F^{\mu \nu}=J^{\mu} .
$$
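The Bianchi identity $F_{\alpha\beta,\gamma}+F_{\beta\gamma,\alpha}+F_{\gamma\alpha,\beta}=0$ holds automatically once $F$ is built from derivatives of $A$, and that is mechanical to check numerically. The sketch below (my own illustration) uses quadratic polynomial components for $A_\mu$, for which central finite differences are essentially exact, and confirms the cyclic sum vanishes at a sample point for all index triples:

```python
h = 1e-3

def A(mu, x):
    """Sample gauge field A_mu(x): quadratic polynomials in (t, x, y, z),
    my own example, chosen so central differences are essentially exact."""
    t, X, y, z = x
    return (0.5 * t * t + X * y,
            t * X + z * z,
            y * z + 2 * t * z,
            X * X - t * y)[mu]

def dA(mu, nu, x):
    """partial_nu A_mu, by central difference."""
    xp, xm = list(x), list(x)
    xp[nu] += h
    xm[nu] -= h
    return (A(mu, xp) - A(mu, xm)) / (2 * h)

def F(mu, nu, x):
    # F_{mu nu} = A_{nu,mu} - A_{mu,nu}
    return dA(nu, mu, x) - dA(mu, nu, x)

def dF(mu, nu, lam, x):
    """partial_lam F_{mu nu}, by central difference."""
    xp, xm = list(x), list(x)
    xp[lam] += h
    xm[lam] -= h
    return (F(mu, nu, xp) - F(mu, nu, xm)) / (2 * h)

x0 = [0.3, 1.2, -0.7, 0.5]
worst = max(abs(dF(a, b, c, x0) + dF(b, c, a, x0) + dF(c, a, b, x0))
            for a in range(4) for b in range(4) for c in range(4))
print(worst)  # ~ 0: only floating-point noise survives
```

The identity is inherited from the commutativity of partial derivatives, which is why no choice of $A_\mu$ can violate it.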
Example 42.2
Returning to the Lagrangian point of view, we claimed in Chapter 40 that the Lagrangian density for the electromagnetic matter fields themselves was L=\mathcal{L}=-(1)/(4)F_(mu nu)F^(mu nu)-\frac{1}{4} F_{\mu \nu} F^{\mu \nu}. We can also include a generalized version of the coupling with Lagrangian L=qdx^(mu)A_(mu)L=q \mathrm{~d} x^{\mu} A_{\mu}, which becomes a contribution to the Lagrangian density of L_("emf ")=J^(mu)A_(mu)\mathcal{L}_{\text {emf }}=J^{\mu} A_{\mu}. We then have the action for the electromagnetic fields and their coupling to charges as given by
Feeding the Lagrangian density in the integrand through the Euler-Lagrange equations yields the Maxwell equations.
This is all very well, but in order to understand the Maxwell equations in the context of gravitation, we should examine the geometrical interpretation of electromagnetism. Indeed we shall shortly see how a geometrical viewpoint leads to the Maxwell equations.
Example 42.3
Before we get to a coordinate-free formulation, we can get an idea of how curvature changes electromagnetism. ^(16){ }^{16} For simplicity we set the currents J^(mu)J^{\mu} to zero. In curved space with metric components g_(mu nu)g_{\mu \nu}, we produce the Faraday tensor using the commas-go-to-semicolons rule, to give
Since we must be careful to use the metric to raise indices, we rewrite the flat-space Lagrangian density L_(em)=-(1)/(4)F^(mu nu)F_(mu nu)\mathcal{L}_{\mathrm{em}}=-\frac{1}{4} F^{\mu \nu} F_{\mu \nu}, which becomes
where the last equality follows from the E-L equations. Since the g^(alpha gamma)g^{\alpha \gamma} simply multiplies all of the terms in the sum, it can be dropped. The equation is then equivalent to
This expression gives two of the Maxwell equations in curved space, in the absence of sources. ${ }^{17}$
${ }^{16}$ The electromagnetic field itself generates curvature via its energy-momentum tensor, as discussed below.
${ }^{18}$ The Poynting vector, named after J. H. Poynting (1852-1944) who invented it, is defined in classical electromagnetism as the vector $\vec{S}=\vec{E} \times \vec{H}$ (equivalently $\vec{S}=\vec{E} \times \vec{B} / \mu_{0}$). Its magnitude gives the energy flux (the energy flow per unit time, per unit area) and its direction indicates the direction of the energy flow. Our expression for $T^{0 j}$ gives the $j$th component of $\vec{S}$, with the constant $\mu_{0}$ disappearing in our units.
The concept of gauges is important for describing several topics in the coming chapters, including gravitational waves. Gauges are explained in more detail in Chapter 44. ${ }^{19}$ It might be helpful to think of the choice of gauge as a choice of language, where we are able to communicate the same message no matter which language we choose. ${ }^{20}$ The notation here makes use of $\partial^{2} \equiv \partial^{\mu} \partial_{\mu}$.
Central to the role of fields in general relativity is the energy-momentum tensor $\boldsymbol{T}(\ ,\ )$. We met a method in Chapter 40 that allows us to derive this object from the Lagrangian of the electromagnetic fields. In Minkowski spacetime, the $(2,0)$ energy-momentum tensor $\boldsymbol{T}$ for the electromagnetic field has components
It is the tensor $\boldsymbol{T}(\ ,\ )$ with these components that couples to the curvature of spacetime.
42.3 Gauge freedom
The electromagnetic 1-form field tilde(A)(x)\tilde{\boldsymbol{A}}(x) is an important one. In several forthcoming chapters, we shall have cause to invoke the principle of gauge freedom, by using eqn 42.20 . The principle says that no physics changes if we make the change A_(mu)(x)rarrA_(mu)(x)-chi_(,mu)(x)A_{\mu}(x) \rightarrow A_{\mu}(x)-\chi_{, \mu}(x). That is to say, the gauge transformation cannot change the result of any measurement. The choice of the smooth function chi(x)\chi(x) is known as a choice of gauge. ^(19){ }^{19} A sensible choice of the function chi(x)\chi(x) can simplify a problem.
Example 42.5
Consider how the expression $\partial_{\nu} F^{\mu \nu}=J^{\mu}$ can be rewritten in terms of the components $A^{\mu}$ as ${ }^{20}$
$$
\partial^{\mu}\left(\partial_{\nu} A^{\nu}\right)-\partial^{2} A^{\mu}=J^{\mu} .
$$
It would simplify our use of this equation of motion if the first term on the left-hand side could be made to go away through a sensible choice of $\chi$. To do this we write $A_{\mu} \rightarrow A_{\mu}^{\prime}=A_{\mu}-\partial_{\mu} \chi$. What we want is then
$$
\partial_{\mu} A^{\prime \mu}=0,
$$
which we can satisfy by setting $\partial^{2} \chi=\partial^{\mu} A_{\mu}$. The equation of motion for the electromagnetic field then becomes
$$
\partial^{2} A^{\mu}=-J^{\mu} .
$$
Note that in the absence of sources, this tells us del^(2)A^(mu)=0\partial^{2} A^{\mu}=0, which is a wave equation ^(21){ }^{21} for the components A^(nu)A^{\nu}, which is to say
The electromagnetic field propagates through space at the speed of light.
The choice of chi\chi such that del_(mu)A^(mu)=0\partial_{\mu} A^{\mu}=0 is known as ^(22){ }^{22} the Lorenz gauge. In fact, the Lorenz gauge does not exhaust the gauge freedom.
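The claim that the gauge-fixed field satisfies a wave equation with light-speed propagation can be illustrated with a plane wave. In the sketch below (my own example, in the source-free case) a scalar profile $\sin(\vec{k}\cdot\vec{x}-\omega t)$ is fed through a finite-difference d'Alembertian $\partial^{2}=-\partial_t^2+\nabla^2$: it is annihilated only when $\omega=|\vec{k}|$, i.e. only for propagation at the speed of light:

```python
import math

def box(f, x, h=1e-3):
    """d'Alembertian -d^2/dt^2 + laplacian of f at x = (t, x, y, z),
    by second-order central differences."""
    out = 0.0
    for mu, sign in enumerate((-1.0, 1.0, 1.0, 1.0)):
        xp, xm = list(x), list(x)
        xp[mu] += h
        xm[mu] -= h
        out += sign * (f(xp) - 2.0 * f(x) + f(xm)) / h**2
    return out

kvec = (0.3, -0.4, 1.2)                            # spatial wavevector
omega_light = math.sqrt(sum(k * k for k in kvec))  # omega = |k|: light speed

def plane_wave(omega):
    return lambda x: math.sin(kvec[0] * x[1] + kvec[1] * x[2]
                              + kvec[2] * x[3] - omega * x[0])

x0 = [0.2, 0.5, -1.0, 0.7]
r_null = abs(box(plane_wave(omega_light), x0))     # ~0: solves the wave eqn
r_off = abs(box(plane_wave(2 * omega_light), x0))  # wrong speed: no solution
print(r_null, r_off)
```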
Example 42.6
This is because we can make a further shift $A_{\mu}^{\prime} \rightarrow A_{\mu}^{\prime \prime}=A_{\mu}^{\prime}-\partial_{\mu} \xi$ as long as $\partial^{2} \xi=0$ (so that both $A_{\mu}^{\prime}$ and $A_{\mu}^{\prime \prime}$ satisfy the Lorenz condition). To make $A^{\prime \prime \mu}$ unique, we shall choose
which implies $A_{0}^{\prime \prime}=0$. With this further choice, the Lorenz condition then reduces to $\vec{\nabla} \cdot \vec{A}^{\prime \prime}=0$. This choice is known as Coulomb gauge.$^{23}$
Similar choices of gauge will be important when we come to examine gravitational waves in Chapter 46. For now, it's useful to note that we can write down the solutions to the equation of motion for the electromagnetic field: $\partial^{2} A^{\mu}=-J^{\mu}$.
Example 42.7
Consider the electromagnetic field at position $\vec{x}$ that results from a source at $\vec{x}^{\prime}$, as shown in Fig. 42.1. In the static case, we have
We arrange the currents and charges in a three-volume $\mathcal{V}^{\prime}$ as shown in Fig. 42.1. These latter equations then have solutions$^{24}$
In words: the charges and currents at positions $\vec{x}^{\prime}$ are added up to give the fields at $\vec{x}$. Since the electromagnetic field propagates at the speed of light, adding time dependence requires us to evaluate the charge and current distribution at the retarded time
where we temporarily reinstate the speed of light $c$. This says that the field experienced at $x^{\mu}=(t, \vec{x})$ is determined by adding up the charges and currents at $\vec{x}^{\prime}$ at a time $t_{\mathrm{r}}$, since it takes $\left|\vec{x}-\vec{x}^{\prime}\right| / c$ seconds for the field to propagate to $\vec{x}$. The solutions are then given by the retarded potentials
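The static version of this sum can be checked numerically. The sketch below is a hypothetical discretization (natural units; a small uniformly charged cube standing in for a point source) that sums $\rho(\vec{x}^{\prime})\, \mathrm{d} V^{\prime} / 4 \pi\left|\vec{x}-\vec{x}^{\prime}\right|$ over cells and compares the result with the Coulomb potential:

```python
import numpy as np

# A point charge q at the origin, modelled as a small uniformly charged cube;
# the static solution is V(x) = sum over cells of rho(x') dV' / (4 pi |x - x'|)
q = 1.0
n, h = 8, 0.05                       # n^3 cells of side h around the origin
rho = q / (n * h) ** 3                # uniform charge density inside the cube
grid = (np.arange(n) - (n - 1) / 2) * h
xs, ys, zs = np.meshgrid(grid, grid, grid, indexing='ij')

x_obs = np.array([3.0, 0.0, 0.0])     # observation point, far from the cube
r = np.sqrt((x_obs[0] - xs)**2 + (x_obs[1] - ys)**2 + (x_obs[2] - zs)**2)
V = np.sum(rho * h**3 / (4 * np.pi * r))

V_coulomb = q / (4 * np.pi * np.linalg.norm(x_obs))
print(V, V_coulomb)                   # the two values agree closely
```

The agreement is excellent because the cube's dipole and quadrupole moments vanish by symmetry, so the leading correction to the monopole term is tiny at this distance.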
$^{21}$ A wave equation has the form $\partial^{2} \phi=-\frac{\partial^{2} \phi}{\partial t^{2}}+\vec{\nabla}^{2} \phi=0$. Its solutions are plane waves of the form $\phi=\mathrm{e}^{\mathrm{i}(\omega t-\vec{k} \cdot \vec{x})}$. $^{22}$ Actually it's more usually (and incorrectly) known as Lorentz gauge due to its misattribution to Hendrik Lorentz (1853-1928) rather than to the less famous Ludvig Lorenz (1829-1891) who used it first. See J. D. Jackson and L. B. Okun, Rev. Mod. Phys. 73, 663 (2001) for details of the history. In gravitation, the analogous choice of gauge is sometimes known as the harmonic gauge or de Donder gauge [the latter named after Théophile de Donder (1872-1957)]. $^{23}$ Charles-Augustin de Coulomb (1736-1806).
Fig. 42.1 The geometry for Example 42.7. The field at $\vec{x}$ results from a distribution of currents and charges with elements at positions $\vec{x}^{\prime}$ in a three-volume $\mathcal{V}^{\prime}$. $^{24}$ See Exercise 42.7, where useful identities are given for filling in the algebra in this example. $^{25}$ Our coordinate-free expressions can be straightforwardly written in coordinates in flat or curved spacetime. We shall continue to assume flat spacetime in our discussion.
We shall later use analogous potentials to describe the gravitational metric field generated by a distribution of masses and energies.
42.4 Geometrical electromagnetism
We now, finally, re-express the equations of electromagnetism in coordinate-free language.$^{25}$ We start with the electromagnetic 1-form gauge field $\tilde{\boldsymbol{A}}(x)$. We take its exterior derivative and find that we obtain the 2-form $\boldsymbol{F}$, or
We pause to prove the assertion above. We have $\tilde{\boldsymbol{A}}=A_{\nu} \boldsymbol{d} x^{\nu}$. The exterior derivative gives a 2-form
We see that $\tilde{\boldsymbol{F}}=\boldsymbol{d} \tilde{\boldsymbol{A}}$, where $\tilde{\boldsymbol{F}}=\frac{1}{2} F_{\mu \nu}\, \boldsymbol{d} x^{\mu} \wedge \boldsymbol{d} x^{\nu}$ and
$^{26}$ We choose to write the indices of the $\vec{E}$ and $\vec{B}$ 3-vectors in the upper position. These are not components of 4-vectors, so there is no implied difference between upper and lower indices.
Fig. 42.2 A $B$-field in the geometrical language circulates in the tubes of a 2-form. The field $\boldsymbol{F}=B^{x} \boldsymbol{d} y \wedge \boldsymbol{d} z$ is represented here, with the field component $B^{x}$ circulating in the tube formed by $\boldsymbol{d} y$ and $\boldsymbol{d} z$.
as we had identified originally.
Recall that the expression $\tilde{\boldsymbol{F}}=\boldsymbol{d} \tilde{\boldsymbol{A}}$ means that the 2-form $\tilde{\boldsymbol{F}}$ is exact and, therefore, closed, with the property that $\boldsymbol{d} \tilde{\boldsymbol{F}}=\boldsymbol{d} \boldsymbol{d} \tilde{\boldsymbol{A}}=0$.
Expressing electromagnetism in terms of the 2-form $\boldsymbol{F}$ allows us to illustrate the theory in a slightly different manner to the usual field lines of $\vec{E}$ and $\vec{B}$, by using the idea from Chapter 32 that a 2-form looks like a set of intersecting planes. Using the wedge product, we can identify the components of the Faraday 2-form with the components of the electric and magnetic 3-vectors$^{26}$ thus
\begin{align*}
\tilde{\boldsymbol{F}}= & E^{x} \boldsymbol{d} x \wedge \boldsymbol{d} t+E^{y} \boldsymbol{d} y \wedge \boldsymbol{d} t+E^{z} \boldsymbol{d} z \wedge \boldsymbol{d} t \\
& +B^{x} \boldsymbol{d} y \wedge \boldsymbol{d} z+B^{y} \boldsymbol{d} z \wedge \boldsymbol{d} x+B^{z} \boldsymbol{d} x \wedge \boldsymbol{d} y \tag{42.54}
\end{align*}
We can therefore picture the components $E^{i}$ and $B^{i}$ circulating in the tubes formed from the intersecting basis 1-forms $\boldsymbol{d} x^{\mu}$. An example is shown in Fig. 42.2.
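To make the correspondence in eqn 42.54 concrete, here is a small symbolic sketch (assuming sympy is available) that assembles the components $F_{\mu \nu}$, with $\tilde{\boldsymbol{F}}=\frac{1}{2} F_{\mu \nu}\, \boldsymbol{d} x^{\mu} \wedge \boldsymbol{d} x^{\nu}$ and coordinate order $(t, x, y, z)$, then checks the antisymmetry and the placement of $B^{x}$ in the $\boldsymbol{d} y \wedge \boldsymbol{d} z$ slot:

```python
import sympy as sp

Ex, Ey, Ez, Bx, By, Bz = sp.symbols('E^x E^y E^z B^x B^y B^z')

# Components F_{mu nu} read off from eqn 42.54 with F = (1/2) F_{mu nu} dx^mu ^ dx^nu
# and coordinate order (t, x, y, z); e.g. the term E^x dx ^ dt gives F_{xt} = E^x.
F = sp.Matrix([[0, -Ex, -Ey, -Ez],
               [Ex,  0,  Bz, -By],
               [Ey, -Bz,  0,  Bx],
               [Ez,  By, -Bx,  0]])

assert F == -F.T                 # a 2-form has antisymmetric components
assert F[2, 3] == Bx             # B^x sits in the dy ^ dz tube (Fig. 42.2)
assert F[1, 0] == Ex             # the E^x dx ^ dt term gives F_{xt} = E^x
print('components consistent with eqn 42.54')
```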
Equation 42.54 allows us access to the components of the equations of motion via eqn 42.12. Specifically, we start with the momentum 1-form $\tilde{\boldsymbol{p}}=p_{\mu} \boldsymbol{d} x^{\mu}$ and the velocity of the particle $\boldsymbol{u}$, which is tangent to its world line (parametrized by the proper time $\tau$). The equation of motion is then $\mathrm{d} \tilde{\boldsymbol{p}} / \mathrm{d} \tau=q \tilde{\boldsymbol{F}}(\;, \boldsymbol{u})$. That is, we fill one of the slots of the 2-form $\boldsymbol{F}$ with the velocity $\boldsymbol{u}$.
Example 42.9
A particle moves along the $z$-axis. In a region where there is a constant magnetic field in the $x$-direction, we have a Faraday tensor
\begin{equation*}
\tilde{\boldsymbol{F}}(\;,\;)=B^{x}\, \boldsymbol{d} y \wedge \boldsymbol{d} z(\;,\;) \tag{42.55}
\end{equation*}
It is clear from this geometrical expression that a gauge transformation cannot have any effect on the Faraday 2-form $\tilde{\boldsymbol{F}}$, since
\begin{equation*}
\tilde{\boldsymbol{F}}=\boldsymbol{d} \tilde{\boldsymbol{A}} \rightarrow \boldsymbol{d} \tilde{\boldsymbol{A}}-\boldsymbol{d} \boldsymbol{d} \chi \tag{42.60}
\end{equation*}
and, since $\boldsymbol{d} \boldsymbol{d}=0$, we see that $\tilde{\boldsymbol{F}} \rightarrow \tilde{\boldsymbol{F}}$ under a gauge transformation. Since a gauge transformation has no effect on the Faraday 2-form, it cannot affect the $\vec{E}$- and $\vec{B}$-fields that make up its components.
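In components, $\boldsymbol{d} \boldsymbol{d} \chi=0$ is nothing more than the equality of mixed partial derivatives. A short symbolic sketch (assuming sympy is available) makes the cancellation explicit for an arbitrary smooth $\chi$:

```python
import sympy as sp

t, x, y, z = sp.symbols('t x y z')
chi = sp.Function('chi')(t, x, y, z)
coords = [t, x, y, z]

# Components of dd(chi): (dd chi)_{mu nu} = d_mu d_nu chi - d_nu d_mu chi,
# which vanishes by the symmetry of mixed partial derivatives.
for mu in range(4):
    for nu in range(4):
        comp = (sp.diff(chi, coords[mu], coords[nu])
                - sp.diff(chi, coords[nu], coords[mu]))
        assert sp.simplify(comp) == 0
print('dd chi = 0: the Faraday 2-form is gauge invariant')
```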
We now turn to the Maxwell equations. The first two Maxwell equations may be written in terms of the Faraday 2-form as
which we know to be true by virtue of the definition $\tilde{\boldsymbol{F}}=\boldsymbol{d} \tilde{\boldsymbol{A}}$ and the fact that $\boldsymbol{d} \boldsymbol{d}=0$. To obtain the Maxwell equations in a familiar guise, we explicitly take an exterior derivative
Another way to express the sum of components in $\boldsymbol{d} \tilde{\boldsymbol{F}}=0$ is to note that, owing to the properties of the wedge product, any term with a repeated index in eqn 42.62 vanishes. As a result, we must have that the sum of the $\partial F_{\alpha \beta} / \partial x^{\gamma}$ with indices all different vanishes. We therefore write an equivalent version of the first two Maxwell equations as
which is the Bianchi identity for electromagnetism that we gave in eqn 42.24. We'll examine its physical and geometrical significance further in the next chapter.
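The component form of the Bianchi identity can be spot-checked on a concrete field configuration. The sketch below (assuming sympy is available, and using an illustrative vacuum plane wave of my own choosing, not one from the text) verifies $F_{\alpha \beta, \gamma}+F_{\gamma \alpha, \beta}+F_{\beta \gamma, \alpha}=0$ for every index triple:

```python
import sympy as sp

t, x, y, z = sp.symbols('t x y z')
coords = [t, x, y, z]

# A vacuum plane wave travelling along x: E = (0, cos(t-x), 0), B = (0, 0, cos(t-x))
Ex, Ey, Ez = 0, sp.cos(t - x), 0
Bx, By, Bz = 0, 0, sp.cos(t - x)

# F_{mu nu} from eqn 42.54, index order (t, x, y, z): F_{xt} = E^x, F_{yz} = B^x, ...
F = sp.Matrix([[0, -Ex, -Ey, -Ez],
               [Ex,  0,  Bz, -By],
               [Ey, -Bz,  0,  Bx],
               [Ez,  By, -Bx,  0]])

# Bianchi identity: F_{alpha beta, gamma} + F_{gamma alpha, beta} + F_{beta gamma, alpha} = 0
for a in range(4):
    for b in range(4):
        for c in range(4):
            cyc = (sp.diff(F[a, b], coords[c]) + sp.diff(F[c, a], coords[b])
                   + sp.diff(F[b, c], coords[a]))
            assert sp.simplify(cyc) == 0
print('d F = 0 holds for the plane wave')
```

The triples with all spatial indices reproduce $\vec{\nabla} \cdot \vec{B}=0$, while those containing $t$ reproduce Faraday's law.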
In order to recreate the other two Maxwell equations, we must find out how the current $\boldsymbol{J}$ is coupled to the electromagnetic fields. The way to couple the fields is not to use $\boldsymbol{F}$, but the 2-form that results from taking the dual of the $(2,0)$ tensor $\boldsymbol{F}$, which is$^{27}$
$^{27}$ See Chapter 37 for the recipe for forming duals. An alternative method to the one used here is to take the dual of the $(0,2)$ Faraday 2-form $\tilde{\boldsymbol{F}}$ to make the $(2,0)$ Maxwell tensor $\star \boldsymbol{F}$. We would then need to lower the indices to construct the components of the 2-form.
The object $\star \boldsymbol{F}$ is known as the Maxwell tensor.
Example 42.11
The dual of the (2,0) Faraday tensor $\boldsymbol{F}=\frac{1}{2} F^{\mu \nu}\left(\boldsymbol{e}_{\mu} \wedge \boldsymbol{e}_{\nu}\right)$ is a 2-form with components
So the change from the 2-form $\tilde{\boldsymbol{F}}$ to the 2-form $\star \boldsymbol{F}$ can be achieved by making a transformation of components corresponding to $\vec{E}$ changing to $-\vec{B}$ and $\vec{B}$ changing to $\vec{E}$. It is also straightforward to show that $\star \star \boldsymbol{F}=-\boldsymbol{F}$.
The Maxwell 2-form is written in terms of the 3-vector $\vec{E}$ and $\vec{B}$ fields as
\begin{align*}
\star \boldsymbol{F}= & E^{x} \boldsymbol{d} y \wedge \boldsymbol{d} z+E^{y} \boldsymbol{d} z \wedge \boldsymbol{d} x+E^{z} \boldsymbol{d} x \wedge \boldsymbol{d} y \\
& +B^{x} \boldsymbol{d} t \wedge \boldsymbol{d} x+B^{y} \boldsymbol{d} t \wedge \boldsymbol{d} y+B^{z} \boldsymbol{d} t \wedge \boldsymbol{d} z \tag{42.70}
\end{align*}
We shall see in the next chapter that the conservation of charge that is so fundamental to electromagnetism is expressed by the dual of the current $\star \boldsymbol{J}$. It is this $(0,3)$ tensor that couples to the 2-form $\star \boldsymbol{F}$ to give the second pair of Maxwell's equations.
Example 42.12
Let's take the dual of the current vector $\boldsymbol{J}$. In $n=4$ dimensions, this results in a 3-form with components $J_{\alpha \beta \gamma}=\varepsilon_{\mu \alpha \beta \gamma} J^{\mu}$ given by
\begin{align*}
\star \boldsymbol{J}= & \rho\, \boldsymbol{d} x \wedge \boldsymbol{d} y \wedge \boldsymbol{d} z-J^{x} \boldsymbol{d} t \wedge \boldsymbol{d} y \wedge \boldsymbol{d} z \\
& -J^{y} \boldsymbol{d} t \wedge \boldsymbol{d} z \wedge \boldsymbol{d} x-J^{z} \boldsymbol{d} t \wedge \boldsymbol{d} x \wedge \boldsymbol{d} y . \tag{42.72}
\end{align*}
The final piece of the puzzle is to determine how to couple the 2-form $\star \boldsymbol{F}$ to the 3-form $\star \boldsymbol{J}$. The answer is to make another 3-form by taking the exterior derivative of the Maxwell 2-form$^{29}$ $\star \boldsymbol{F}$.
Example 42.13
Take the exterior derivative of the Maxwell 2-form to make a 3-form
\begin{align*}
\boldsymbol{d} \star \boldsymbol{F}= & \left(\frac{\partial E_{x}}{\partial x}+\frac{\partial E_{y}}{\partial y}+\frac{\partial E_{z}}{\partial z}\right) \boldsymbol{d} x \wedge \boldsymbol{d} y \wedge \boldsymbol{d} z \\
& +\left(\frac{\partial E_{x}}{\partial t}+\frac{\partial B_{y}}{\partial z}-\frac{\partial B_{z}}{\partial y}\right) \boldsymbol{d} t \wedge \boldsymbol{d} y \wedge \boldsymbol{d} z \\
& +\left(\frac{\partial E_{y}}{\partial t}+\frac{\partial B_{z}}{\partial x}-\frac{\partial B_{x}}{\partial z}\right) \boldsymbol{d} t \wedge \boldsymbol{d} z \wedge \boldsymbol{d} x \\
& +\left(\frac{\partial E_{z}}{\partial t}+\frac{\partial B_{x}}{\partial y}-\frac{\partial B_{y}}{\partial x}\right) \boldsymbol{d} t \wedge \boldsymbol{d} x \wedge \boldsymbol{d} y \tag{42.73}
\end{align*}
$^{28}$ The corresponding (2,0) tensor has components
Match each component with that of the 3-form $\star \boldsymbol{J}$
\begin{align*}
\star \boldsymbol{J}= & \rho\, \boldsymbol{d} x \wedge \boldsymbol{d} y \wedge \boldsymbol{d} z-J^{x} \boldsymbol{d} t \wedge \boldsymbol{d} y \wedge \boldsymbol{d} z \\
& -J^{y} \boldsymbol{d} t \wedge \boldsymbol{d} z \wedge \boldsymbol{d} x-J^{z} \boldsymbol{d} t \wedge \boldsymbol{d} x \wedge \boldsymbol{d} y . \tag{42.74}
\end{align*}
This results in the remaining Maxwell equations.
We conclude that the second two Maxwell equations are given by
We have now built a geometrical version of the Maxwell equations. In the next chapter, we shall look a little more closely at the notion of charge conservation and see how this allows us to understand the Maxwell equations in their geometric form.
Chapter summary
Interactions of charges with electromagnetic fields are described in flat space by the equation of motion
The Faraday tensor is the 2-form $\tilde{\boldsymbol{F}}=\boldsymbol{d} \tilde{\boldsymbol{A}}$ where $\tilde{\boldsymbol{A}}$ is the gauge field.
In coordinate-free notation, the Maxwell equations can be expressed as
to show how the components of $\vec{E}$ and $\vec{B}$ transform when a boost is made along the $x$ axis.
(42.3) (a) Why in electronics are we free to set the zero of electric potential? Are we allowed to set zero volts differently in London and in Manchester? (b) What is the effect on the equations for the $\vec{E}$- and $\vec{B}$-fields of making the changes to the components $A^{\mu}=(V, \vec{A})$ of the electromagnetic field $\boldsymbol{A}(x)$ of
where $\chi$ is a function of space and time coordinates?
(c) Consider Maxwell's equations given in terms of the field $\boldsymbol{A}$. What happens to the equations if we choose $V$ and $\vec{A}$ such that
(42.4) (a) In flat space, what $\vec{B}$-field results from a vector potential with components $\vec{A}=(0, C x, 0)$, where $C$ is a constant? This is known as Landau gauge.
(b) What $\vec{B}$-field results from a vector potential $\vec{A}=\frac{1}{2}(-C y, C x, 0)$, where $C$ is a constant?
(42.5) (a) Show that $F_{\alpha \beta, \gamma}+F_{\gamma \alpha, \beta}+F_{\beta \gamma, \alpha}=0$ yields two Maxwell equations.
(b) Show that $F^{\mu \nu}{}_{, \nu}=J^{\mu}$ yields the other two.
(42.6) Derive the expressions linking the components of $T$ to the $\vec{E}$ and $\vec{B}$ fields.
(42.7) By considering the spatial Laplacian of a component $\vec{\nabla}^{2} A^{\mu}$, show that the retarded potential in eqn 42.50 satisfies the inhomogeneous wave equation $\partial^{2} A^{\mu}\left(t, x^{\mu}\right)=-J^{\mu}\left(t, x^{\mu}\right)$.
Hint: Use the identities $\vec{\nabla} R=\hat{\vec{R}}$, $\vec{\nabla}(1 / R)=-\hat{\vec{R}} / R^{2}$, $\vec{\nabla} \cdot(\hat{\vec{R}} / R)=1 / R^{2}$ and $\vec{\nabla} \cdot\left(\hat{\vec{R}} / R^{2}\right)=4 \pi \delta^{(3)}(\vec{R})$, where $R=\left|\vec{x}-\vec{x}^{\prime}\right|$.
(42.8) We can show that eqn 42.50 transforms as a 4-vector in flat space. Consider the field $A^{\mu}$ at point $\mathcal{P}$ at $(0,0,0,0)$ resulting from light travelling from the current $J^{\mu}$ at points $\mathcal{Q}$ at $(-t, x, y, z)$. We have
where $r^{2}=x^{2}+y^{2}+z^{2}$, $|\vec{r}|=-t$ (and we drop the primed notation for the volume element $\mathrm{d} \mathcal{V}$ to avoid confusion with the primed coordinates examined in this problem!).
(a) In a primed frame moving at speed $v$ along the $x$-direction relative to the unprimed frame, show that
\begin{equation*}
r^{\prime}=\gamma r(1+v \cos \theta) \tag{42.83}
\end{equation*}
where $\gamma=\left(1-v^{2}\right)^{-\frac{1}{2}}$ and $\theta$ is the angle between $\vec{r}$ and the $x$ axis.
(b) The three-volume $\mathrm{d} \mathcal{V}$ captures all of the events that intersect a spherical light wave launched backward in time from $\mathcal{P}$. Show that the volume element in the primed frame is given by $\mathrm{d} \mathcal{V}^{\prime}=\mathrm{d} x^{\prime} \mathrm{d} y^{\prime} \mathrm{d} z^{\prime}=\mathrm{d} x^{\prime} \mathrm{d} y\, \mathrm{d} z$, with
(c) In the previous equation, $\mathrm{d} t$ gives the difference in times for measuring the extremities of the interval $\mathrm{d} x$. Use this to show that
and, as a result, that eqn 42.50 transforms as a 4-vector.
(42.9) Show that the solutions to the equations of motion in eqn 42.57 are helices.
(42.10) We define projection tensors with components
where $p^{\mu}$ are the components of a momentum vector. Verify that $P_{\mathrm{L}}$ and $P_{\mathrm{T}}$ are indeed projection operators by showing that $P^{2}=P$.
(42.11) The Lagrangian for electromagnetism in vacuo is, in Minkowski space, $\mathcal{L}=-\frac{1}{4} F^{\mu \nu} F_{\mu \nu}$.
(a) Show that this can be rewritten as
(42.12) The canonical energy-momentum tensor from the last question is not symmetric (and so cannot be used for general relativity). However, we can symmetrize it by adding an extra term $\partial_{\lambda} X^{\lambda \mu \nu}$, where $X^{\lambda \mu \nu}=-F^{\mu \lambda} A^{\nu}$. Show that the symmetrized energy-momentum tensor $T^{\mu \nu}=S^{\mu \nu}+\partial_{\lambda} X^{\lambda \mu \nu}$ can be written as
as we had before.
(42.13) We can attempt to formulate a scalar field theory for gravitation using the same method we used for electromagnetism. We follow the approach of Padmanabhan.
Consider the action for the coupling between the scalar field $\phi\left(x^{\mu}\right)$ and matter
(a) Verify that this is a sensible suggestion by restoring factors of $c$, and expanding the Lagrangian in the non-relativistic limit to $O\left(1 / c^{2}\right)$.
(b) Show that the equation of motion for a particle is given by
(c) If $\phi$ describes gravitation then the equation of motion must obey the principle of equivalence. By rescaling the field $\phi$, show that the action must take the form
The latter equation implies that the effect of gravitation can be accounted for by a change in the metric $\boldsymbol{\eta} \rightarrow \boldsymbol{g}$ with
\begin{equation*}
g_{\mu \nu}=(1+\Phi)^{2} \eta_{\mu \nu} \tag{42.95}
\end{equation*}
(42.14) We can examine the equation of motion of the field $\Phi$ from the previous question.
(a) Show that the Lagrangian (where $\rho$ is the mass density of particles) leads to Poisson's equation for the gravitational field.
(b) How must this Lagrangian be altered to take into account gravitational coupling to energy in addition to mass?
Charge conservation and the Bianchi identity
For I have wings equipped to fly
Up to the high vault of the sky.
Once these are harnessed, your swift mind
Views earth with loathing, far behind.
Boethius (c.480-c.524/6) The Consolation of Philosophy
The conservation of charge, when described using geometry, leads to a mathematical expression known as the Bianchi identity. In electromagnetism, a Bianchi identity generates two of the Maxwell equations: the equations of motion of the electromagnetic fields. In the case of gravitation, the analogous equation (also called a Bianchi identity) encodes conservation of energy, and provides the constraint that tells us how matter couples to the curvature of spacetime. In this chapter, we follow a geometrical path, originally beaten by Misner, Thorne, and Wheeler, from charge conservation to the electromagnetic and then gravitational Bianchi identity.$^{1}$ The tools we need are mostly taken from Chapter 38 and the arguments in this chapter are mostly geometrical ones involving several computations. As a result, the chapter can be skipped on a first reading if you're happy to take the Bianchi identity on trust and without a physical or geometrical justification.
43.1 Conserving electric charge
Conservation of charge can be written as a divergence $\boldsymbol{\nabla} \cdot \boldsymbol{J}=0$. In flat spacetime,$^{2}$ we can write this as the component equation$^{3}$
This expresses the fact that charge is locally conserved, so no spontaneous generation of charges can take place in a given 4-volume $\mathcal{V}$. This is important in relativity. Global conservation of charge would mean that no charges can be created or destroyed, but would allow a charge to disappear at one end of the Universe and reappear at some arbitrary point in spacetime. This would allow superluminal transport of charges, and so is forbidden. We must therefore have the stronger condition of local charge conservation enshrined in eqn 43.2.
Let's unpack$^{4}$ eqn 43.2 a little more. We are free to integrate the conservation equation over the 4-volume $\mathcal{V}$ and then apply Stokes' theorem,
43.1 Conserving electric charge
43.2 Electromagnetic gauge field
43.3 Gravitational curvature
Chapter summary
Exercises
$^{1}$ Luigi Bianchi (1856-1928) rediscovered the identity for the Riemann tensor in 1902. This had (according to Tullio Levi-Civita) been discovered by Gregorio Ricci-Curbastro around 1889, who had supposedly forgotten about it. A contracted version (examined in the exercises) had been derived in 1880 by Aurel Voss (1845-1931). $^{2}$ We work in flat spacetime for simplicity, but remember that our valid tensor equations also apply in curved space. $^{3}$ In a more familiar form, we write this as the flat-space continuity equation
Integrating this latter equation with respect to time, we have the result that the total change in the amount of charge in the volume $\mathcal{V}$ is equal to the total charge that flows through the surface $S$. This is local conservation of charge.
to yield
where $\partial \mathcal{V}$ is the boundary of the 4-volume and $\mathrm{d} \Sigma_{\mu}$ is a 3-volume element. The right-hand side of eqn 43.5 is a statement that conservation means that the integral of currents across a boundary $\partial \mathcal{V}$ is zero. Charges enter and exit a given volume, but no new charge is generated in the volume $\mathcal{V}$. We shall now see how this fact is accounted for naturally in geometry.
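For orientation, here is a minimal symbolic check (assuming sympy is available) that a rigidly drifting lump of charge, an assumed illustrative profile rather than anything from the text, satisfies the continuity equation $\partial \rho / \partial t+\vec{\nabla} \cdot \vec{J}=0$:

```python
import sympy as sp

t, x, y, z = sp.symbols('t x y z')

# A sample locally conserved current: a lump of charge sliding along x at speed v
v = sp.Rational(1, 2)
g = sp.exp(-(x - v * t)**2)        # assumed lump profile, for illustration only
rho, Jx, Jy, Jz = g, v * g, 0, 0   # J = rho * v for rigid transport

divJ = sp.diff(rho, t) + sp.diff(Jx, x) + sp.diff(Jy, y) + sp.diff(Jz, z)
assert sp.simplify(divJ) == 0
print('continuity equation satisfied')
```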
The current of electric charge $\boldsymbol{J}$ is the source of the electromagnetic field tensor $\boldsymbol{F}$. The charges create the fields, and the fields tell the charges how to move. The link between the source $\boldsymbol{J}$ and the field $\boldsymbol{F}$ has, hard-wired into it, the fact that the source is conserved. This follows from the geometrical law we met in Chapter 38 that the boundary of a boundary is zero. To see this, we need to work with a slightly different version of the conservation law. Rather than $\boldsymbol{\nabla} \cdot \boldsymbol{J}=0$, we shall use the equivalent version $\boldsymbol{d} \star \boldsymbol{J}=0$. We explain the origin and meaning of this expression below. We start by extracting the dual $\star \boldsymbol{J}$, which will be the flux in the new version of our conservation law.
Example 43.1
In flat (3+1)-dimensional spacetime, the vector $\boldsymbol{J}$ has components $J^{\mu}=\left(\rho, J^{x}, J^{y}, J^{z}\right)$ and so we have a dual 3-form $\star \boldsymbol{J}$, with components
\begin{equation*}
(\star \boldsymbol{J})_{012}\, \boldsymbol{d} t \wedge \boldsymbol{d} x \wedge \boldsymbol{d} y=\varepsilon_{3012} J^{z} \boldsymbol{d} t \wedge \boldsymbol{d} x \wedge \boldsymbol{d} y \tag{43.7}
\end{equation*}
$^{5}$ We saw in Chapter 38 that the 3-volume element can be written as the 3-form
\begin{align*}
\mathrm{d} \tilde{\boldsymbol{\Sigma}}_{\mu} & =\varepsilon_{\mu|\alpha \beta \gamma|}\, \boldsymbol{d} x^{\alpha} \wedge \boldsymbol{d} x^{\beta} \wedge \boldsymbol{d} x^{\gamma} \\
& =\frac{1}{3!}\, \varepsilon_{\mu \alpha \beta \gamma}\, \boldsymbol{d} x^{\alpha} \wedge \boldsymbol{d} x^{\beta} \wedge \boldsymbol{d} x^{\gamma} .
\end{align*}
In the same way that $\star \boldsymbol{J}$ represents $J^{\mu} \mathrm{d} \Sigma_{\mu}$ in eqn 43.5, this latter equation for $\boldsymbol{d} \star \boldsymbol{J}$ expresses the integrand $J^{\mu}{}_{, \mu} \mathrm{d} \mathcal{V}$. This is because the 4-volume element can be replaced with the 4-form $\varepsilon_{0123}\, \boldsymbol{d} t \wedge \boldsymbol{d} x \wedge \boldsymbol{d} y \wedge \boldsymbol{d} z$ and so we can write
We conclude that conservation of charge in eqn 43.5 may be written as
\begin{equation*}
0=\int_{\mathcal{V}} \boldsymbol{d} \star \boldsymbol{J}=\int_{\partial \mathcal{V}} \star \boldsymbol{J} \tag{43.12}
\end{equation*}
The conservation law is shown schematically in Fig. 43.1.
The point of this section is that the vanishing of these two integrals follows inevitably from (i) Maxwell's equations and (ii) the geometric fact that the boundary of a boundary is zero. Specifically, we can relate the dual of the source $\star \boldsymbol{J}$ to the dual of the electromagnetic field $\boldsymbol{F}$ using the Maxwell equation $\star \boldsymbol{J}=\boldsymbol{d} \star \boldsymbol{F}$. All we need do now is write the conservation law in terms of an integral over $\boldsymbol{d} \star \boldsymbol{F}$ rather than $\star \boldsymbol{J}$ and then apply Stokes' theorem
In this final equation we see that, since the boundary of the boundary $\partial \partial \mathcal{V}$ is zero, this integral must itself yield zero (since it's being computed over a chain that vanishes). Working backwards then, we can view the conservation of the source field, encoded in $\boldsymbol{d} \star \boldsymbol{J}=0$, as following from the fact that $\partial \partial \mathcal{V}=0$.$^{6}$
43.2 Electromagnetic gauge field
The argument sketched in the last section can be applied to the electromagnetic gauge field $\tilde{\boldsymbol{A}}$ to extract a Bianchi identity for electromagnetism. We write the electromagnetic 2-form field in terms of the gauge 1-form field as $\tilde{\boldsymbol{F}}=\boldsymbol{d} \tilde{\boldsymbol{A}}$. Consider how the expression $\boldsymbol{d} \tilde{\boldsymbol{F}}$ is modified through the application of Stokes' theorem
In words, the facts that (i) the boundary of a boundary is zero, and (ii) that $\tilde{\boldsymbol{F}}$ is exact, mean that it is inevitable$^{7}$ that $\boldsymbol{d} \tilde{\boldsymbol{F}}=0$. This is the first statement of the Bianchi identity for electromagnetism.
There are two ways to see the consequences of this, examined in the next examples.
Example 43.2
We examine the term $\int_{\partial \partial \mathcal{V}} \tilde{\boldsymbol{A}}$, integrating the 1-form $\tilde{\boldsymbol{A}}=A_{\mu} \boldsymbol{d} x^{\mu}$ over the boundary of a cube in flat spacetime. That is to say, we examine
Fig. 43.1 Conservation of charge tells us that the rate of change of the amount of charge in volume $\mathcal{V}$ is equal to the flux $\star \boldsymbol{J}$ summed over the surface $\partial \mathcal{V}$. $^{6}$ This explanation can be found in Cartan, although this was something Richard Feynman was not aware of, as shown by his remark in his Lectures on Gravitation that "I do not, offhand, know the geometric significance of the Bianchi identity." The version we describe here is taken from Misner, Thorne, and Wheeler. $^{7}$ Of course this was also mandated by $\boldsymbol{d} \boldsymbol{d}=0$ (via $\boldsymbol{d} \boldsymbol{d} \tilde{\boldsymbol{A}}=\boldsymbol{d} \tilde{\boldsymbol{F}}=0$), and we can see here the sense in which this is dual to $\partial \partial=0$.
Fig. 43.2 A contribution to the integral over the boundary $\partial \partial \mathcal{V}$. $^{8}$ This flat-spacetime equation can be written using comma notation as $F_{[\alpha \beta, \gamma]}=0$. The comma-goes-to-semicolon rule for curved spacetime then says $F_{[\alpha \beta ; \gamma]}=0$. $^{9}$ The rule, from Chapter 32, is
which, since the only non-vanishing component of the bivector is $\left(\boldsymbol{e}_{x} \wedge \boldsymbol{e}_{y}\right)^{x y}=1$, yields $F_{x y}$. In fact, it's trivial to see this by simply saying $\left\langle\tilde{\boldsymbol{F}}, \boldsymbol{e}_{x} \wedge \boldsymbol{e}_{y}\right\rangle=\tilde{\boldsymbol{F}}\left(\boldsymbol{e}_{x}, \boldsymbol{e}_{y}\right)$.
as we traverse each edge of the cube. Consider the path shown in Fig. 43.2. Labelling the corners of the square anticlockwise from $(x, y)$, we find that in traversing a square we obtain a contribution
A_{x}(\mathcal{A}) \mathrm{d} x+A_{y}(\mathcal{B}) \mathrm{d} y-A_{x}(\mathcal{C}) \mathrm{d} x-A_{y}(\mathcal{D}) \mathrm{d} y
=\left[A_{x}(\mathcal{A})-A_{x}(\mathcal{C})\right] \mathrm{d} x+\left[A_{y}(\mathcal{B})-A_{y}(\mathcal{D})\right] \mathrm{d} y
\begin{equation*}
=\left(\frac{\partial A_{y}}{\partial x}-\frac{\partial A_{x}}{\partial y}\right) \mathrm{d} x\, \mathrm{d} y \tag{43.17}
\end{equation*}
The contribution from the face shown is therefore given by
\begin{equation*}
\int\left(\frac{\partial A_{y}}{\partial x}-\frac{\partial A_{x}}{\partial y}\right) \mathrm{d} x\, \mathrm{d} y=\int F_{x y}\, \mathrm{d} x\, \mathrm{d} y \tag{43.18}
\end{equation*}
Repeating this procedure and adding to this the contribution from the opposite face, which is traversed in the other direction, we obtain
\begin{equation*}
\int\left[F_{x y}(z+\mathrm{d} z)-F_{x y}(z)\right] \mathrm{d} x\, \mathrm{d} y=\int \frac{\partial F_{x y}}{\partial z}\, \mathrm{d} x\, \mathrm{d} y\, \mathrm{d} z \tag{43.19}
\end{equation*}
Adding all of the contributions, we obtain
\begin{equation*}
\int_{\partial \partial \mathcal{V}} A_{\mu} \boldsymbol{d} x^{\mu}=\int\left(\frac{\partial F_{x y}}{\partial z}+\frac{\partial F_{z x}}{\partial y}+\frac{\partial F_{y z}}{\partial x}\right) \mathrm{d} x \mathrm{~d} y \mathrm{~d} z \tag{43.20}
\end{equation*}
The integrand on the right must be zero, since we know that \int_{\partial \partial \mathcal{V}} \tilde{\boldsymbol{A}}=0. The cube can then be reoriented in the (3+1) dimensions in which we're working, and the argument repeated for different coordinates.
We conclude that in flat spacetime the components of the Faraday tensor obey the constraining equation { }^{8}
\begin{equation*}
\frac{\partial F_{\alpha \beta}}{\partial x^{\gamma}}+\frac{\partial F_{\beta \gamma}}{\partial x^{\alpha}}+\frac{\partial F_{\gamma \alpha}}{\partial x^{\beta}}=0 \tag{43.21}
\end{equation*}
This equation, a consequence of the fact that a boundary of a boundary is zero, is the Bianchi identity in component form. It gives us the two Maxwell equations which, as we saw in the last chapter, are contained in \boldsymbol{d} \tilde{\boldsymbol{F}}=0.
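Since F is built from derivatives of a potential, the component Bianchi identity holds automatically by the equality of mixed partial derivatives. A short symbolic check (our sketch; the function names A0–A3 are ours) confirms this for arbitrary potentials:

```python
import sympy as sp

t, x, y, z = sp.symbols('t x y z')
coords = (t, x, y, z)

# Four arbitrary potential components A_mu(t, x, y, z).
A = [sp.Function(f'A{mu}')(*coords) for mu in range(4)]

# Faraday tensor from the potential: F_{mu nu} = d_mu A_nu - d_nu A_mu.
F = [[sp.diff(A[n], coords[m]) - sp.diff(A[m], coords[n])
      for n in range(4)] for m in range(4)]

# Cyclic sum d_gamma F_{alpha beta} + d_alpha F_{beta gamma} + d_beta F_{gamma alpha}
def bianchi(a, b, c):
    return (sp.diff(F[a][b], coords[c])
            + sp.diff(F[b][c], coords[a])
            + sp.diff(F[c][a], coords[b]))

# The sum vanishes identically for every choice of indices.
print(all(sp.simplify(bianchi(a, b, c)) == 0
          for a in range(4) for b in range(4) for c in range(4)))  # -> True
```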
Example 43.3
Another way to approach this same problem, which proves useful when we consider gravity, is to consider I=\int_{\partial \mathcal{V}} \tilde{\boldsymbol{F}}=0, that is, the surface integral of the 2-form \tilde{\boldsymbol{F}}. We can interpret this integral over a 2-form in the spirit of Chapter 38. We represent the surface by the bivector \boldsymbol{u} \wedge \boldsymbol{v} and, to evaluate the integral, take the inner product of this bivector with the 2-form \tilde{\boldsymbol{F}}. Geometrically, this returns a number that tells us how many tubes of \boldsymbol{F} are cut by the parallelogram \boldsymbol{u} \wedge \boldsymbol{v}. We have a contribution I_{x y} from the face shown in Fig. 43.2 given by
\begin{equation*}
I_{x y}=\int_{\Delta x \boldsymbol{e}_{x} \wedge \Delta y \boldsymbol{e}_{y}} \tilde{\boldsymbol{F}}=\int\left\langle\tilde{\boldsymbol{F}}, \boldsymbol{e}_{x} \wedge \boldsymbol{e}_{y}\right\rangle \Delta x \Delta y \tag{43.22}
\end{equation*}
Using the expression for the inner product of a bivector and a 2-form { }^{9} we have
\begin{equation*}
I_{x y}=\int F_{x y} \Delta x \Delta y \tag{43.23}
\end{equation*}
Repeating the argument for the opposite face, as in the last example, and adding this (exercise) leads to the contribution
\begin{equation*}
\int \frac{\partial F_{x y}}{\partial z} \Delta x \Delta y \Delta z \tag{43.24}
\end{equation*}
Repeating and adding the contributions from the other faces gives us the Bianchi identity in component form once again.
43.3 Gravitational curvature
We saw in Chapter 13 that the Bianchi identity for gravitation was an essential part of justifying the Einstein equation. Just like its electromagnetic analogue, the gravitational Bianchi identity follows from considering the consequence of the boundary of a boundary being zero. In the electromagnetic case, we evaluated the effect of this on the 2-form field \boldsymbol{F}. For gravitation the analogous quantity is the (1,3) Riemann tensor \boldsymbol{R}. Instead of working with \boldsymbol{R} directly, it is easiest to use Cartan's method (Chapter 36) to find the curvature 2-form, which allows us to act on vectors using the operator equation \boldsymbol{d}^{2} \boldsymbol{w}=\mathcal{R}(\boldsymbol{w}),
where the curvature operator is \mathcal{R}()=\boldsymbol{\omega}^{\nu}() \otimes \boldsymbol{e}_{\mu} \otimes \mathcal{R}^{\mu}{ }_{\nu} and \mathcal{R}^{\mu}{ }_{\nu} is the all-important curvature 2-form given in terms of the connection 1-forms { }^{10} via \mathcal{R}^{\mu}{ }_{\nu}=\boldsymbol{d} \boldsymbol{\omega}^{\mu}{ }_{\nu}+\boldsymbol{\omega}^{\mu}{ }_{\alpha} \wedge \boldsymbol{\omega}^{\alpha}{ }_{\nu}.
Why focus on \mathcal{R}? The answer is that, just as the electromagnetic Bianchi identity can be written as \boldsymbol{d} \tilde{\boldsymbol{F}}=0, the gravitational version is
The Bianchi identity for gravitation:
\begin{equation*}
\boldsymbol{d} \mathcal{R}=0 . \tag{43.28}
\end{equation*}
We prove this important result in three steps, each involving a computation.
Step I: The result of taking an exterior derivative of the curvature 2-form is
\begin{equation*}
\boldsymbol{d} \mathcal{R}^{\mu}{ }_{\nu}=\mathcal{R}^{\mu}{ }_{\alpha} \wedge \boldsymbol{\omega}^{\alpha}{ }_{\nu}-\boldsymbol{\omega}^{\mu}{ }_{\alpha} \wedge \mathcal{R}^{\alpha}{ }_{\nu} \tag{43.29}
\end{equation*}
Example 43.4
Equation 43.29 can be proved as follows. We work in the orthonormal frame, { }^{11} so that we have \boldsymbol{\omega}_{\mu \nu}=-\boldsymbol{\omega}_{\nu \mu}. Take the derivative of \mathcal{R}^{\mu}{ }_{\nu} to find
Step II: We need the result of taking the exterior derivative of a bivector formed from the basis vectors. { }^{12} { }^{10} We also have the equation that determines the connection 1-forms
{ }^{11} We won't put hats on the indices in this chapter, to prevent clutter. { }^{12} We need the rule that if \tilde{\boldsymbol{p}} is a p-form and \tilde{\boldsymbol{q}} is a 1-form then \boldsymbol{d}(\tilde{\boldsymbol{p}} \wedge \tilde{\boldsymbol{q}})=\boldsymbol{d} \tilde{\boldsymbol{p}} \wedge \tilde{\boldsymbol{q}}+(-1)^{p} \tilde{\boldsymbol{p}} \wedge \boldsymbol{d} \tilde{\boldsymbol{q}}.
We also have the rule that tensor products of 1-forms and 1-vectors commute, allowing us to write \boldsymbol{e}_{\mu} \otimes \boldsymbol{\omega}^{\nu}=\boldsymbol{\omega}^{\nu} \otimes \boldsymbol{e}_{\mu}.
Fig. 43.3 Boundary of a boundary arguments for electromagnetism (top) and gravity (bottom).
Step III: Working in the orthonormal frame, we define \mathcal{R}^{\mu \nu}=\mathcal{R}^{\mu}{ }_{\alpha} \eta^{\alpha \nu}. We can then consider an alternative definition of the curvature operator
In the last line, we've used the fact that \boldsymbol{\omega}^{\mu}{ }_{\alpha} \wedge \mathcal{R}^{\nu \alpha}=\boldsymbol{\omega}^{\mu \alpha} \wedge \mathcal{R}^{\nu}{ }_{\alpha}, and that \mathcal{R}^{\mu}{ }_{\nu} is a 2-form and \boldsymbol{\omega}^{\mu}{ }_{\nu} is a 1-form, so reversing the order of the wedge product does not pick up a sign.
Although the computation was rather lengthy, we have established the Bianchi identity \boldsymbol{d} \mathcal{R}=0. This allows us to repeat the general argument above (and in Fig. 43.3) involving the boundary of a boundary. Since \boldsymbol{d} \mathcal{R}=0, we have that
\begin{equation*}
0=\int_{\mathcal{V}} \boldsymbol{d} \mathcal{R}=\int_{\partial \mathcal{V}} \mathcal{R} . \tag{43.37}
\end{equation*}
Here, \mathcal{R} is defined by the double derivative of a vector field \boldsymbol{w} by writing \boldsymbol{d}^{2} \boldsymbol{w}=\mathcal{R}(\boldsymbol{w}). We're now suggesting that 0=\int_{\partial \mathcal{V}} \boldsymbol{d}^{2} \boldsymbol{w}=\int_{\partial \partial \mathcal{V}} \boldsymbol{d} \boldsymbol{w}, so we might expect to be able to derive a component version of the Bianchi identity by evaluating the vector-valued 1-form \boldsymbol{d} \boldsymbol{w}=\boldsymbol{\nabla} \boldsymbol{w} round a boundary and setting the result equal to zero. Equivalently, we can evaluate the surface integral of \boldsymbol{d}^{2} \boldsymbol{w} and obtain the same answer. We shall do the latter. { }^{14}
To evaluate our integral, we need the inner product of the form \boldsymbol{d}^{2} \boldsymbol{w} with a surface element, as examined in the next example.
Example 43.7
Let's consider \left\langle\boldsymbol{d} \tilde{\boldsymbol{\alpha}}, \boldsymbol{u} \wedge \boldsymbol{v}\right\rangle, where \tilde{\boldsymbol{\alpha}} is a 1-form and \boldsymbol{u} and \boldsymbol{v} are vectors. This object counts the number of cells of the 2-form \boldsymbol{d} \tilde{\boldsymbol{\alpha}} in the parallelogram formed by the bivector \boldsymbol{u} \wedge \boldsymbol{v}. { }^{15} It can be rewritten as { }^{16}
\left\langle\boldsymbol{d} \tilde{\boldsymbol{\alpha}}, \boldsymbol{u} \wedge \boldsymbol{v}\right\rangle=\boldsymbol{\nabla}_{\boldsymbol{u}}\langle\tilde{\boldsymbol{\alpha}}, \boldsymbol{v}\rangle-\boldsymbol{\nabla}_{\boldsymbol{v}}\langle\tilde{\boldsymbol{\alpha}}, \boldsymbol{u}\rangle-\langle\tilde{\boldsymbol{\alpha}},[\boldsymbol{u}, \boldsymbol{v}]\rangle .
This result generalizes to any tensor-valued 1-form S\boldsymbol{S}, giving
\begin{equation*}
\langle\boldsymbol{d} \boldsymbol{S}, \boldsymbol{u} \wedge \boldsymbol{v}\rangle=\boldsymbol{\nabla}_{\boldsymbol{u}}\langle\boldsymbol{S}, \boldsymbol{v}\rangle-\boldsymbol{\nabla}_{\boldsymbol{v}}\langle\boldsymbol{S}, \boldsymbol{u}\rangle-\langle\boldsymbol{S},[\boldsymbol{u}, \boldsymbol{v}]\rangle \tag{43.40}
\end{equation*}
For the special case at hand, involving the vector-valued 1-form \boldsymbol{d} \boldsymbol{w}, we use \langle\boldsymbol{d} \boldsymbol{w}, \boldsymbol{u}\rangle=\boldsymbol{\nabla}_{\boldsymbol{u}} \boldsymbol{w} and find
\begin{align*}
\left\langle\boldsymbol{d}^{2} \boldsymbol{w}, \boldsymbol{u} \wedge \boldsymbol{v}\right\rangle & =\boldsymbol{\nabla}_{\boldsymbol{u}} \boldsymbol{\nabla}_{\boldsymbol{v}} \boldsymbol{w}-\boldsymbol{\nabla}_{\boldsymbol{v}} \boldsymbol{\nabla}_{\boldsymbol{u}} \boldsymbol{w}-\boldsymbol{\nabla}_{[\boldsymbol{u}, \boldsymbol{v}]} \boldsymbol{w} \\
& =\hat{\boldsymbol{R}}(\boldsymbol{u}, \boldsymbol{v}) \boldsymbol{w} \tag{43.41}
\end{align*}
where, as before, \hat{\boldsymbol{R}} is the covariant Riemann curvature operator \left[\boldsymbol{\nabla}_{\boldsymbol{u}}, \boldsymbol{\nabla}_{\boldsymbol{v}}\right]-\boldsymbol{\nabla}_{[\boldsymbol{u}, \boldsymbol{v}]}.
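Equation 43.41 can be verified directly with a computer algebra system. The sketch below is ours: it uses the unit 2-sphere as a concrete curved space, with w0 and w1 the components of an arbitrary vector field, builds the Christoffel symbols, applies the covariant derivative twice, and confirms that the commutator reproduces the Riemann tensor contraction (the Lie bracket term vanishes for coordinate basis vectors).

```python
import sympy as sp

th, ph = sp.symbols('theta phi')
coords = (th, ph)
g = sp.Matrix([[1, 0], [0, sp.sin(th)**2]])   # unit 2-sphere metric
ginv = g.inv()
n = 2

def Gamma(a, b, c):
    """Christoffel symbol Gamma^a_{bc}."""
    return sum(ginv[a, d]*(sp.diff(g[d, b], coords[c])
                           + sp.diff(g[d, c], coords[b])
                           - sp.diff(g[b, c], coords[d]))
               for d in range(n))/2

def Riemann(a, b, c, d):
    """R^a_{bcd} = d_c Gamma^a_{db} - d_d Gamma^a_{cb} + Gamma.Gamma terms."""
    expr = sp.diff(Gamma(a, d, b), coords[c]) - sp.diff(Gamma(a, c, b), coords[d])
    expr += sum(Gamma(a, c, e)*Gamma(e, d, b) - Gamma(a, d, e)*Gamma(e, c, b)
                for e in range(n))
    return sp.simplify(expr)

# An arbitrary vector field w^a(theta, phi)
w = [sp.Function('w0')(th, ph), sp.Function('w1')(th, ph)]

# First covariant derivative: T^a_b = nabla_b w^a
T = [[sp.diff(w[a], coords[b]) + sum(Gamma(a, b, c)*w[c] for c in range(n))
      for b in range(n)] for a in range(n)]

def DD(a, b, c):
    """Second covariant derivative nabla_c (nabla_b w)^a of the (1,1) tensor T."""
    return (sp.diff(T[a][b], coords[c])
            + sum(Gamma(a, c, d)*T[d][b] for d in range(n))
            - sum(Gamma(d, c, b)*T[a][d] for d in range(n)))

# Check [nabla_theta, nabla_phi] w^a = R^a_{b theta phi} w^b
ok = all(sp.simplify(DD(a, 1, 0) - DD(a, 0, 1)
                     - sum(Riemann(a, b, 0, 1)*w[b] for b in range(n))) == 0
         for a in range(n))
print(ok)  # -> True
```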
In the electromagnetic case we examined in Example 43.3, we had \langle\tilde{\boldsymbol{F}}, \boldsymbol{u} \wedge \boldsymbol{v}\rangle=\tilde{\boldsymbol{F}}(\boldsymbol{u}, \boldsymbol{v}), allowing us to evaluate the surface integral over a closed surface. The Bianchi identity \boldsymbol{d} \tilde{\boldsymbol{F}}=0 allowed us to set this to zero. In the same way, we have the equation
which allows us to do the surface integral, which we know, from \boldsymbol{d} \mathcal{R}=0, must be zero. This provides a coordinate version of the Bianchi identity. All that's left to do is to rerun the argument in Example 43.3. For simplicity we use Riemann normal coordinates. { }^{18} The outward contribution from the y z face of the cube at x+\Delta x is given by
\begin{equation*}
\hat{\boldsymbol{R}}\left(\boldsymbol{e}_{y}, \boldsymbol{e}_{z}\right) \boldsymbol{w} \Delta y \Delta z \tag{43.43}
\end{equation*}
In coordinates, this is
\begin{equation*}
R^{\alpha}{ }_{\beta y z}(\text { at } x+\Delta x) w^{\beta} \Delta y \Delta z \tag{43.44}
\end{equation*}
{ }^{14} The former approach is followed in Misner, Thorne, and Wheeler. You can see how taking \boldsymbol{\nabla} \boldsymbol{w} around the boundary resembles the process in Example 35.5 for obtaining the components of \boldsymbol{R}. In this case, each loop comprises the face of the infinitesimal cube, and opposite faces are taken in different directions, leading to the derivatives. The details are left as an exercise. { }^{15} As mentioned in Chapter 32, this quantity is equivalent to the inner product \left\langle\boldsymbol{d} \tilde{\boldsymbol{\alpha}}, \boldsymbol{u} \wedge \boldsymbol{v}\right\rangle. { }^{16} If in doubt, think of \boldsymbol{d} \tilde{\boldsymbol{\alpha}} as an object with two slots, which we'll fill with vectors \boldsymbol{u} and \boldsymbol{v} in order, via the object \boldsymbol{u} \otimes \boldsymbol{v}. Putting the vector \boldsymbol{u} in the first (\boldsymbol{d}) slot gives \boldsymbol{u} \cdot \boldsymbol{d} \equiv \boldsymbol{u} \cdot \boldsymbol{\nabla}=\boldsymbol{\nabla}_{\boldsymbol{u}}. This was verified in Exercise 34.6. { }^{17} We note that the recipe for taking the covariant derivative of a 1-form like \tilde{\boldsymbol{\alpha}} is given by considering the inner product \boldsymbol{\nabla}_{\boldsymbol{u}}\langle\tilde{\boldsymbol{\alpha}}, \boldsymbol{w}\rangle=\left\langle\boldsymbol{\nabla}_{\boldsymbol{u}} \tilde{\boldsymbol{\alpha}}, \boldsymbol{w}\right\rangle+\left\langle\tilde{\boldsymbol{\alpha}}, \boldsymbol{\nabla}_{\boldsymbol{u}} \boldsymbol{w}\right\rangle. { }^{18} See Chapter 35 for a description of these. In short, they describe a local inertial frame, so the vanishing connection coefficients at the origin imply that we can treat ordinary derivatives as equivalent to covariant ones in this treatment.
The opposite face gives a similar contribution, except that the sign is reversed and we evaluate at the point x rather than x+\Delta x. The combination yields
\begin{equation*}
\frac{\partial R^{\alpha}{ }_{\beta y z}}{\partial x} w^{\beta} \Delta x \Delta y \Delta z . \tag{43.45}
\end{equation*}
Repeating for all of the other faces, summing and setting equal to zero, we obtain the Bianchi identity for gravitation in component form
\begin{equation*}
R^{\hat{\alpha}}{ }_{\hat{\beta} \hat{y} \hat{z}, \hat{x}}+R^{\hat{\alpha}}{ }_{\hat{\beta} \hat{z} \hat{x}, \hat{y}}+R^{\hat{\alpha}}{ }_{\hat{\beta} \hat{x} \hat{y}, \hat{z}}=0 \tag{43.46}
\end{equation*}
where we've reinstated hats to stress that this is true in the local inertial frame. It is promoted to a valid tensor equation, true in a general coordinate frame, on using the, by now familiar, comma-goes-to-semicolon rule. { }^{19} Finally, we can generalize the cube so that it's described by any set of coordinate axes. This gives a final expression for the Bianchi identity of { }^{20}
\begin{equation*}
R^{\alpha}{ }_{\beta \mu \nu ; \lambda}+R^{\alpha}{ }_{\beta \nu \lambda ; \mu}+R^{\alpha}{ }_{\beta \lambda \mu ; \nu}=0 \tag{43.47}
\end{equation*}
{ }^{19} That is, we have the curved-spacetime equation R^{\alpha}{ }_{\beta y z ; x}+R^{\alpha}{ }_{\beta z x ; y}+R^{\alpha}{ }_{\beta x y ; z}=0. { }^{20} In tensor notation, we can also write
This is the version of the equation we used in Chapter 13.
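The component Bianchi identity can also be checked symbolically for a concrete curved metric. The following sketch is ours: we pick an inhomogeneously curved three-dimensional metric (chosen so that the covariant derivatives of the Riemann tensor do not vanish individually) and confirm that the cyclic sum R^{\alpha}{}_{\beta\mu\nu;\lambda}+R^{\alpha}{}_{\beta\nu\lambda;\mu}+R^{\alpha}{}_{\beta\lambda\mu;\nu} vanishes.

```python
import sympy as sp

ch, th, ph = sp.symbols('chi theta phi', positive=True)
coords = (ch, th, ph)
n = 3

# A deliberately inhomogeneous curved 3-metric (our illustrative choice):
# ds^2 = d chi^2 + chi^4 (d theta^2 + sin^2 theta d phi^2)
g = sp.diag(1, ch**4, ch**4*sp.sin(th)**2)
ginv = g.inv()

def Gamma(a, b, c):
    """Christoffel symbol Gamma^a_{bc}."""
    return sum(ginv[a, d]*(sp.diff(g[d, b], coords[c])
                           + sp.diff(g[d, c], coords[b])
                           - sp.diff(g[b, c], coords[d]))
               for d in range(n))/2

def Riemann(a, b, c, d):
    """R^a_{bcd} from the Christoffel symbols."""
    expr = sp.diff(Gamma(a, d, b), coords[c]) - sp.diff(Gamma(a, c, b), coords[d])
    expr += sum(Gamma(a, c, e)*Gamma(e, d, b) - Gamma(a, d, e)*Gamma(e, c, b)
                for e in range(n))
    return sp.simplify(expr)

R = [[[[Riemann(a, b, c, d) for d in range(n)] for c in range(n)]
      for b in range(n)] for a in range(n)]

def DR(a, b, c, d, e):
    """Covariant derivative R^a_{bcd;e}."""
    expr = sp.diff(R[a][b][c][d], coords[e])
    expr += sum(Gamma(a, e, f)*R[f][b][c][d]
                - Gamma(f, e, b)*R[a][f][c][d]
                - Gamma(f, e, c)*R[a][b][f][d]
                - Gamma(f, e, d)*R[a][b][c][f] for f in range(n))
    return expr

# Cyclic sum over the distinct coordinate triple (chi, theta, phi); the
# remaining index choices follow from the antisymmetry of the last pair.
ok = all(sp.simplify(DR(a, b, 0, 1, 2) + DR(a, b, 1, 2, 0)
                     + DR(a, b, 2, 0, 1)) == 0
         for a in range(n) for b in range(n))
print(ok)  # -> True
```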
This chapter has given us a geometrical sense of where the Bianchi identity comes from, both in electromagnetism and in gravitation. There is one final link that we can make between these theories using the Bianchi identity. It is tempting to identify the Einstein field equation with the Maxwell equations, since these are the key equations of motion for both theories. However, a closer analogy can be made between the Maxwell equations and the Weyl tensor \boldsymbol{C} that we met in Chapter 35. { }^{21} This is examined in the next example.
Example 43.8
Since the Weyl tensor collects those parts of the Riemann tensor that do not appear in the Einstein field equation, it would seem that \boldsymbol{C} represents that part of the spacetime curvature that is not generated locally by the matter distribution. The components of the Riemann tensor obey the Bianchi identity, which can be rewritten { }^{22} in terms of the components of \boldsymbol{C} as
\begin{equation*}
C^{\mu \nu \alpha \beta}{ }_{; \beta}=J^{\mu \nu \alpha}, \tag{43.51}
\end{equation*}
where J^{\mu \nu \alpha}=R^{\alpha[\mu ; \nu]}+\frac{1}{6} g^{\alpha[\nu} R^{; \mu]}. Since eqn 43.51 looks so much like the Maxwell equations F^{\alpha \beta}{ }_{; \beta}=J^{\alpha}, we can think of the Bianchi identity as a field equation for the Weyl tensor, telling us how matter determines curvature at other points in spacetime, just as currents determine electromagnetic field lines in other parts of spacetime.
In the next chapter, we turn to another important aspect of classical field theory that plays a major role in gravitation: the notion of a gauge field.
The Bianchi identity for electromagnetism is a component version of the expression \boldsymbol{d} \tilde{\boldsymbol{F}}=0. The flat-space component equation reads
that was derived in Chapter 13, verify that eqn 43.51 reduces to (a contraction of) the Bianchi identity.
(43.3) Conservation of the source of gravitational curvature is written as \boldsymbol{\nabla} \cdot \boldsymbol{T}=0, but it can also be written as
\begin{equation*}
\boldsymbol{d} \star \boldsymbol{T}=0 . \tag{43.58}
\end{equation*}
Since the Einstein equations are simply \boldsymbol{G}=8 \pi \boldsymbol{T}, the dual form is \boldsymbol{d} \star \boldsymbol{G}=0.
We shall find expressions for \star \boldsymbol{T} and \star \boldsymbol{G}.
Take the dual of the vector part of the (1,1) version of \boldsymbol{T}=T^{\mu}{ }_{\nu} \boldsymbol{e}_{\mu} \otimes \boldsymbol{\omega}^{\nu} to show that
where \mathrm{d} \boldsymbol{\Sigma}_{\nu} is the surface 3-form.
(43.4) Recall the identity from Exercise 34.3, valid in a coordinate frame, that
See Misner, Thorne, and Wheeler for a discussion of the physical content of this equation.
44
Gauge fields
44.1 Fibre bundles and gauge invariance
44.2 Parallel transport and field strength
Chapter summary
Exercises
Gauge theory is very helpful for understanding weak gravitational fields (Chapter 45) and gravitational waves (Chapter 46). It also forms the basis of high-dimensional theories (Chapter 48). { }^{1} 'Gauge' is an unfortunate term in this context, referring originally to the thickness of metal wires and rails, but we are stuck with it. In field theory, it derives from Hermann Weyl's use of the term in general relativity that we discuss in Chapter 45.
Fig. 44.1 (a) A representation of the field \rho(x) \mathrm{e}^{\mathrm{i} \theta(x)}, where the angle of each arrow to the vertical represents \theta(x) at a particular point. (b) The effect of a global phase transformation \theta(x) \rightarrow \theta(x)+\alpha.
If you leave a thing alone you leave it to a torrent of change.
G. K. Chesterton (1874-1936) Heretics
Suppose you move through some distance in space and you measure a change in a vector. There are two possible reasons for the change: (i) the intrinsic change of the vector field with position and (ii) the change, with position, of the coordinate system you are using. In Chapter 7, these factors led to our definition of the covariant derivative, in words and coordinates respectively, as
\begin{align*}
\binom{\text { Covariant }}{\text { derivative }} & =\binom{\text { Change }}{\text { in vector }}-\binom{\text { Change due to }}{\text { coordinate system }} \\
\left(\boldsymbol{\nabla}_{\alpha} \boldsymbol{v}\right)^{\mu} & =\frac{\partial v^{\mu}}{\partial x^{\alpha}}+\Gamma^{\mu}{ }_{\alpha \beta} v^{\beta} . \tag{44.1}
\end{align*}
That there are two contributions to the change in a vector with position is closely related to the physics of phases and gauges { }^{1} in field theory. In particular, there is an illuminating analogy between the covariant derivative \nabla_{\mu} in relativity and the covariant derivative D_{\mu} used to describe field theories with local symmetries. In this chapter, we explore this link.
44.1 Fibre bundles and gauge invariance
Fields depend on variables like x, the position in spacetime. We can think of the field \psi stretching out over all time and space, such that there is a value of \psi(x) at each point in spacetime. A field theory is described by a Lagrangian \mathcal{L}. If we perform a transformation of the coordinates that feature in the theory and obtain the same value of \mathcal{L}, then the equations of motion that one derives from the Lagrangian are unchanged, and we say that the theory has a symmetry. In addition to their dependence on coordinate variables like x, fields can have other, internal variables, such as a phase \theta that causes the field to carry around a factor \mathrm{e}^{\mathrm{i} \theta}. An example is a field \psi(x)=\rho(x) \mathrm{e}^{\mathrm{i} \theta(x)}, which could be visualized as having an amplitude \rho at each point in space, as well as a phase \theta, which could be represented in a diagram by a phasor arrow, as shown in Fig. 44.1(a). We can perform transformations on the phase itself, perhaps by making the change \theta(x) \rightarrow \theta(x)+\alpha, which amounts to a rotation of all of the arrows by an angle \alpha as shown in Fig. 44.1(b). If
the Lagrangian does not change under this transformation then we say that we have an internal symmetry.
Mathematically speaking, despite being a function of the position in spacetime x, the phase \theta(x) does not itself live in (3+1)-dimensional spacetime. So the change in the phase does not correspond to any change in the spatial coordinates, which is to say that the rotation of the arrows is a rotation in the internal space of the field. To understand these internal variables, we make use of the mathematical notion of a fibre bundle over spacetime. { }^{2} We imagine that floating above each point in spacetime is another space called a fibre, described by the internal variables. In the mathematical language, we call the usual (3+1)-dimensional spacetime the base space \mathcal{M}. The fibre bundle \mathcal{B} is a structure formed by combining a fibre \mathcal{V} at each point in \mathcal{M} (or, in Roger Penrose's words, 'an \mathcal{M} worth of \mathcal{V}'s'). The bundle \mathcal{B} can then be thought of as comprising \mathcal{M} with the structure floating above \mathcal{M}, as shown in Fig. 44.2.
Example 44.1
We have met such a structure before when we considered tangent vectors. A vector \boldsymbol{v} that is tangent to a curve at a point \mathcal{P} in a spacetime manifold \mathcal{M} lives in a vector space known as a tangent space, denoted \mathcal{T}_{\mathcal{P}}, that can be thought of as floating above the point. The components of the vector can then be thought of as being stored in a fibre (that is, in the tangent space \mathcal{T}_{\mathcal{P}}) above \mathcal{P}. There is a different tangent space for each point in the spacetime. Combining all of the tangent spaces, we make a tangent bundle \mathcal{T} \mathcal{M}.
Another example of bundles in physics is isospin. Many matter fields act as if they have an internal switch that allows us to change the identity of particle excitations at a particular point. This internal variable is isospin and can be treated in terms of fibres.
A slightly different example is Newtonian spacetime, which is formed from a one-dimensional manifold \mathbb{R} that corresponds to points in time. Floating above each point in time is a three-dimensional space that gives the configuration of the system in space at that time. { }^{3}
In our example of a field \psi with a phase, it is the phase variable \theta that is stored in a fibre above each point in spacetime. Since we want the field to be single valued, the angle \theta must be the same as \theta+2 \pi, and we can visualize each of the fibres as a unit circle floating above a point in spacetime as shown (for two spatial dimensions) in Fig. 44.3. (This is, of course, equivalent to our picture of arrows in Fig. 44.1, but making the unit circles explicit is helpful for later applications.) The global rotation of the phases by an amount \alpha simply adds \alpha to the values stored in each of the fibres. If in doubt, think of each unit circle fibre as a knob that can be turned. The global rotation corresponds to twisting all of the knobs by the same angle \alpha.
A very special property of fields is revealed if we examine the transformation of the internal phase variable in detail. Let's confine ourselves to flat space and consider the so-called complex scalar field theory. This theory is built from complex-number-valued fields \psi that are described by the matter-field Lagrangian \mathcal{L}=\partial^{\mu} \psi^{\dagger} \partial_{\mu} \psi-m^{2} \psi^{\dagger} \psi,
Fig. 44.2 A one-dimensional manifold \mathcal{M} has a fibre \mathcal{V} floating above each point. The combination of an \mathcal{M} worth of \mathcal{V}s makes the bundle \mathcal{B}. { }^{2} Fibre bundles are described more mathematically in Appendix C. { }^{3} This structure is what general relativity replaces with our notion of metrics defined on a manifold.
Fig. 44.3 View from above of a two-dimensional space with a circular fibre (or knob) floating above each point.
{ }^{5} This is of course because \psi^{\dagger} \rightarrow \psi^{\dagger} \mathrm{e}^{-\mathrm{i} \alpha} and so the angle \alpha cancels in the combinations of the field and its Hermitian conjugate in \mathcal{L}.
where m is a constant. { }^{4} This theory has an internal symmetry known as a global phase symmetry, or global U(1) symmetry. This is shown if we make the replacement
\begin{equation*}
\psi(x) \rightarrow \psi(x) \mathrm{e}^{\mathrm{i} \alpha}, \tag{44.3}
\end{equation*}
since then the Lagrangian \mathcal{L} (and hence the equation of motion) does not change. { }^{5} The transformation in eqn 44.3 is a global transformation in that the phase changes by the same amount (i.e. an addition of \alpha) at every point in spacetime.
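The global U(1) invariance is quick to verify symbolically. In this sketch (ours; we work in a single spacetime dimension for brevity and write \psi=\rho \mathrm{e}^{\mathrm{i} \theta} with real \rho and \theta) the Lagrangian is unchanged by a constant shift of the phase:

```python
import sympy as sp

x, m, alpha = sp.symbols('x m alpha', real=True)
rho = sp.Function('rho', real=True)(x)      # amplitude
theta = sp.Function('theta', real=True)(x)  # phase

def lagrangian(th):
    psi = rho*sp.exp(sp.I*th)
    psibar = rho*sp.exp(-sp.I*th)
    # One-dimensional stand-in for  d psi^dag d psi - m^2 psi^dag psi
    return sp.diff(psibar, x)*sp.diff(psi, x) - m**2*psibar*psi

# Global transformation theta -> theta + alpha, with alpha a constant
diff_L = sp.simplify(lagrangian(theta + alpha) - lagrangian(theta))
print(diff_L)  # -> 0: the Lagrangian is unchanged
```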
We can now ask an interesting question: what if we attempt to change the phase by different amounts at each point in spacetime? This would mean changing the phases \theta(x) by an amount \alpha(x), that is, by an amount that depends in some arbitrary (but smoothly varying) manner on position in spacetime. This is known as a local transformation.
The message is that, perhaps unsurprisingly, the theory is not invariant with respect to local phase transformations. However, we can fix things to guarantee local invariance by adding another field into the mix. To that end we introduce a 1-form field \tilde{\boldsymbol{A}}(x)=A_{\mu}(x) \boldsymbol{d} x^{\mu}. This field, whose job is to cancel out the effect of the change in internal variable \alpha(x) with position, is known as a gauge field. This enters into a new covariant field derivative D_{\mu}=\partial_{\mu}+\mathrm{i} q A_{\mu}.
Example 44.3
If \psi(x) \rightarrow \psi(x) \mathrm{e}^{\mathrm{i} \alpha(x)}, then \partial_{\mu} \psi \rightarrow \mathrm{e}^{\mathrm{i} \alpha}\left(\partial_{\mu} \psi\right)+\mathrm{i} \psi \mathrm{e}^{\mathrm{i} \alpha}\left(\partial_{\mu} \alpha\right) and so
since now, with D_{\mu} \psi \rightarrow \mathrm{e}^{\mathrm{i} \alpha} D_{\mu} \psi, the first term is invariant.
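The covariance of D_{\mu} \psi can also be confirmed symbolically. The sketch below is ours (again one dimension for brevity): transforming \psi and A together leaves D \psi with only an overall phase factor.

```python
import sympy as sp

x = sp.symbols('x', real=True)
q = sp.symbols('q', real=True, nonzero=True)
psi = sp.Function('psi')(x)
A = sp.Function('A', real=True)(x)
alpha = sp.Function('alpha', real=True)(x)

def D(field, gauge):
    # Covariant field derivative D = d/dx + i q A (one dimension for brevity)
    return sp.diff(field, x) + sp.I*q*gauge*field

# Local transformation: psi -> psi e^{i alpha(x)}, A -> A - (d alpha/dx)/q
psi_t = psi*sp.exp(sp.I*alpha)
A_t = A - sp.diff(alpha, x)/q

# D psi transforms covariantly: D'psi' = e^{i alpha} D psi
diff_cov = sp.simplify(D(psi_t, A_t) - sp.exp(sp.I*alpha)*D(psi, A))
print(diff_cov)  # -> 0
```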
The message is that if the phase \alpha(x) is a function of x then, in order to guarantee local phase invariance, a new 1-form gauge field \tilde{\boldsymbol{A}}(x) with components A_{\mu}(x) is required to build the covariant field derivative. Furthermore, we recognize this field as akin to the electromagnetic gauge field introduced in Chapter 42, where we saw that one of its most intriguing features was that, without changing the Maxwell equations, it could be changed by an arbitrary amount
which is known as a gauge transformation. We see from eqn 44.8 that if we identify \chi(x) with \alpha(x) / q, then the transformation demanded in eqn 44.8 is simply gauge invariance, confirming that \boldsymbol{A} has the usual properties of a gauge field as defined in electromagnetism.
To summarize, the complex scalar field theory can be made to have a local phase symmetry [that is, an invariance of the Lagrangian under a transformation \theta(x) \rightarrow \theta(x)+\alpha(x)] by introducing a gauge field \tilde{\boldsymbol{A}}(x) to build a covariant derivative with components D_{\mu}=\partial_{\mu}+\mathrm{i} q A_{\mu}. The theory, made locally phase invariant by replacing \partial_{\mu} \rightarrow D_{\mu} in the Lagrangian, is known as a gauge theory. The gauge field transforms via a gauge transformation A_{\mu} \rightarrow A_{\mu}-\partial_{\mu} \alpha / q. Gauge theories are the basis of particle physics, as well as the physics of superconductivity and topological states of matter like the fractional quantum Hall fluid. { }^{6}
In terms of our picture of the fibre bundle, the local transformation \theta(x) \rightarrow \theta(x)+\alpha(x) corresponds to a shift in the value of the phase stored in the fibre floating above point x. Since the shifts are different on different fibres, the resulting bundle is often called a strained bundle, in the same way that a strained solid has atoms displaced by different amounts as a function of position. The derivative \partial_{\mu} connects the different fibres in that it forces us to consider differences between variables stored in different fibres. In order to take account of the strain in the bundle that we have produced, we say that we introduce a strain into the connection via the field \tilde{\boldsymbol{A}}(x). This all sounds vaguely reminiscent of the connection coefficients in geometry. In fact, a further insight into gauge fields is revealed if we compare the covariant derivative from relativity \nabla_{\mu} with the covariant field derivative D_{\mu}, which is our next subject for discussion. { }^{6} In the case we have examined, if we interpret q as the electric charge, \tilde{\boldsymbol{A}} is the electromagnetic field. { }^{7} We read off that, for the geometric version in eqn 44.15, we need some extra indices to keep track of which component of the vector we are examining. However, we can see that there is an analogy if we take \mathrm{i} q A_{\nu}(x) to be equivalent to \Gamma^{\mu}{ }_{\nu \beta}(x).
44.2 Parallel transport and field strength
Let's revisit the covariant derivative \nabla_{\mu}, and the parallel transport that it allows, from the point of view of fields. Move from x to x+\mathrm{d} x and a field \psi changes by an amount \mathrm{d} \psi. The way to make sense of the change is to note that \psi(x) and \psi(x+\mathrm{d} x) are measured in different coordinate systems. Thus, the apparently innocent equation
that describes the total change in the field in moving through a distance dx\mathrm{d} x, carries information about two coordinate systems.
We want to isolate properties that change due to changes in the internal space of the field as the field is moved around. To do this we make use of parallel transport. In the geometric description of Chapter 7, we made sense of the covariant derivative conceptually as \binom{\text { Covariant }}{\text { derivative of } \boldsymbol{v}}_{\boldsymbol{u}}=\binom{\text { total change }}{\text { in vector } \boldsymbol{v}}-\binom{\text { change due to }}{\text { coordinate system }}.
The condition for the parallel transport of a vector \boldsymbol{v} along a path whose tangent vector is \boldsymbol{u} is that \boldsymbol{\nabla}_{\boldsymbol{u}} \boldsymbol{v}=0, which is to say that the changes in the vector components \left(\partial v^{\mu} / \partial x^{\nu}\right) reflect only the change in the coordinate system (measured by -\Gamma^{\mu}{ }_{\nu \alpha} v^{\alpha}).
Example 44.4
We write \boldsymbol{\nabla}_{\boldsymbol{u}} \boldsymbol{v}=0 in components as
The changes in the components of the vector field are given by the connection coefficients, which encode how the coordinates change as we move around spacetime.
In the same way that geometrical parallel transport allows us access to the change in the vector component \delta v^{\mu} due to the change in underlying coordinates, parallel transport also allows us access to the change in the scalar field \delta \psi due only to the effect of the change in coordinate system on the field's internal coordinates. For the complex scalar field theory described above, we write the change under parallel transport as { }^{7} \delta \psi=-\mathrm{i} q A_{\mu} \psi \mathrm{d} x^{\mu}.
We therefore arrive at a general description of the covariant derivative as evaluating \mathrm{d} \psi, the total change in the scalar field on translation, minus \delta \psi, the change due to the effects of the change in coordinate system on the internal variables of the field. In words and symbols:
\begin{align*}
D \psi & =\binom{\text { Change }}{\text { in field }}-\binom{\text { Change due to }}{\text { coordinate system }} \\
& =\mathrm{d} \psi-\delta \psi \\
& =\mathrm{d} \psi+\mathrm{i} q A_{\mu} \psi \mathrm{d} x^{\mu} . \tag{44.17}
\end{align*}
We then have a general covariant derivative fit for a description of the internal variables of the field.
Comparing this to the geometric version, we see how the components of the gauge field tilde(A)\tilde{\boldsymbol{A}} play the role of the connections Gamma^(mu)_(alpha beta)\Gamma^{\mu}{ }_{\alpha \beta}.
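The gauge covariance that motivates this definition can be checked symbolically. Below is a minimal SymPy sketch (not from the text; the sign convention $\psi \rightarrow \mathrm{e}^{\mathrm{i}q\chi}\psi$ with $A_{\mu} \rightarrow A_{\mu}-\partial_{\mu}\chi$ is one common choice, assumed here) showing that $D\psi$ acquires exactly the same phase factor as $\psi$:

```python
import sympy as sp

# Check that D = d + i q A is covariant: under the local phase change
# psi -> e^{i q chi} psi together with A -> A - d chi (one common sign
# convention, assumed here), D psi picks up the same phase as psi.

x, q = sp.symbols('x q', real=True)
psi = sp.Function('psi')(x)
A = sp.Function('A')(x)
chi = sp.Function('chi')(x)

def D(f, gauge_field):
    """Covariant derivative in one dimension."""
    return sp.diff(f, x) + sp.I * q * gauge_field * f

psi_new = sp.exp(sp.I * q * chi) * psi
A_new = A - sp.diff(chi, x)

# D'psi' - e^{i q chi} D psi should vanish identically:
residual = sp.simplify(D(psi_new, A_new) - sp.exp(sp.I * q * chi) * D(psi, A))
print(residual)  # 0
```

The extra derivative of the phase is cancelled by the shift in the gauge field, which is the whole point of the construction.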
Example 44.5
Returning once more to the fibre bundle picture, the idea of parallel transport can be generalized. As usual, we picture the fibres floating above the base space. Instead of parallel transport, we now ask that a quantity found in a fibre that doesn't change
as we move through spacetime, be connected by a horizontal curve as shown in
Fig. 44.4. We therefore have a notion of horizontal transport, expressed mathematically by saying that for horizontal transport $D_{\mu} \psi=0$, or $\partial_{\mu} \psi=-\mathrm{i} q A_{\mu} \psi$.
Example 44.6
One use of parallel transport in Chapter 11 was to parallel transport a vector around a loop. This resulted in a measure of the curvature of spacetime. Now that we have a general prescription for the covariant derivative of a gauge field $\psi$, we will try it out by transporting the field $\psi$ around a closed infinitesimal quadrilateral loop $\mathcal{ABCDA}$ and see what happens. Since the field is single valued, the effect of the change in the field $\mathrm{d} \psi$ does not contribute, and we are left with the change equivalent to the parallel transport of the field around the loop. Expanding to second order, the change in the field in moving from $\mathcal{A}$ to $\mathcal{B}$, along a line element $\Delta x^{\mu}$, is found by considering
We agree to keep all terms to second order in going around the rest of the loop. The process is then repeated for the line section leading from point $\mathcal{B}$ to $\mathcal{C}$, which has length $\delta x^{\mu}$. Denoting $\psi\left(x_{\mathcal{B}}\right)$ as $\psi_{\mathcal{B}}$ we write
Fig. 44.4 The bundle B\mathcal{B} floating above the manifold M\mathcal{M}. The analogue of parallel transport is horizontal transport through B\mathcal{B}.
Moving now from point $\mathcal{C}$ to $\mathcal{D}$, assumed to have separation $\Delta x^{\mu}$, we have
where the commutator $\left[D_{\mu}, D_{\nu}\right]=D_{\mu} D_{\nu}-D_{\nu} D_{\mu}$. Notice that the area of the loop is $\delta x^{\mu} \Delta x^{\nu}$ and so the change in the field is given by the action of $\left[D_{\mu}, D_{\nu}\right]$ on the field, multiplied by the area of the loop.
It is striking and significant that the parallel transport around an infinitesimal loop has resulted in a change determined by the action of a linear operator $\left[D_{\mu}, D_{\nu}\right]$ on the field $\psi$. This commutator is often called the gauge field strength, or
where F_(mu nu)F_{\mu \nu} are the components of the Faraday tensor tilde(F)\tilde{\boldsymbol{F}}.
We see from the last example that the commutator of the covariant derivative gives us access to the tensor $\tilde{\boldsymbol{F}}$ that tells us about the strength and dynamics of the gauge fields $\tilde{\boldsymbol{A}}$. In the fibre-bundle language, we can say that $\left[D_{\mu}, D_{\nu}\right]$ is a curvature operator that tells us about the local strain in the bundle. This local strain is represented by the components of the curvature tensor $\boldsymbol{F}$.
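The relation $\left[D_{\mu}, D_{\nu}\right]\psi=\mathrm{i} q F_{\mu\nu}\psi$ can be verified symbolically. A minimal SymPy sketch (illustrative, not from the text), in two coordinates:

```python
import sympy as sp

# Check that the commutator of covariant derivatives acting on psi
# returns i q F_{mu nu} psi with F_{mu nu} = d_mu A_nu - d_nu A_mu.

t, x, q = sp.symbols('t x q', real=True)
coords = (t, x)
psi = sp.Function('psi')(t, x)
A = [sp.Function('A0')(t, x), sp.Function('A1')(t, x)]

def D(mu, f):
    """Covariant derivative D_mu = d_mu + i q A_mu."""
    return sp.diff(f, coords[mu]) + sp.I * q * A[mu] * f

commutator = sp.expand(D(0, D(1, psi)) - D(1, D(0, psi)))
F01 = sp.diff(A[1], t) - sp.diff(A[0], x)

residual = sp.simplify(commutator - sp.I * q * F01 * psi)
print(residual)  # 0
```

All derivative-of-$\psi$ terms and the $q^{2}A_{\mu}A_{\nu}$ terms cancel in the antisymmetrization, leaving only the field-strength combination.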
What about the version for curved spacetime, rather than internal space? We saw in Chapter 35 that, when working in a coordinate system, parallel transport of a vector Z\boldsymbol{Z} around an infinitesimal loop also resulted in the action of the commutator of the covariant derivative grad_(mu)\boldsymbol{\nabla}_{\mu}. In direct analogy, we had the coordinate-frame expression
It is clear that the Riemann tensor R\boldsymbol{R} is analogous to the Faraday tensor F\boldsymbol{F} in that it gives access to the local curvature. The difference is that the field strength R\boldsymbol{R} describes the curvature of spacetime, while the field strength iqF\mathrm{i} q \boldsymbol{F} describes the curvature of the fibre bundle containing the internal variables. We summarize the analogy in the table below, where we also link in the terminology of the fibre bundle picture.
| General relativity | Gauge theory | Fibre bundles |
| :---: | :---: | :---: |
| coordinate transformation | local phase transformation | strained bundle |
| connection coefficient $\Gamma^{\mu}{ }_{\alpha \beta}$ | gauge field $A_{\mu}$ | |
| curvature $R^{\mu}{ }_{\nu \alpha \beta}$ | field strength $\mathrm{i} q F_{\mu \nu}$ | bundle curvature |
In this chapter, we have shown one example of a gauge theory: the complex scalar field theory [also known as $U(1)$ theory]. In this case, the gauge field $\tilde{\boldsymbol{A}}$ is exactly the gauge field that describes electromagnetism. There are many more examples of gauge field theories, of which the complex scalar field theory is the simplest. They all share the structure presented here, with the field strength being provided by the commutator of the covariant derivative. In later chapters, we shall encounter gauges again. Most dramatically, we will see in Chapter 48 an attempt to add a dimension to spacetime in order to accommodate an internal parameter.
Chapter summary
Internal variables carried by fields can be thought of as living in a fibre, suspended above a point in spacetime. The fibres are combined to make a fibre bundle.
A field can be made invariant with respect to local changes of phase by introducing a gauge field tilde(A)(x)\tilde{\boldsymbol{A}}(x). This leads to a covariant derivative D_(mu)=del_(mu)+iqA_(mu)D_{\mu}=\partial_{\mu}+\mathrm{i} q A_{\mu}.
Parallel transport of a field around a loop results in a contribution to the field proportional to the action of the linear operator $\left[D_{\mu}, D_{\nu}\right]=\mathrm{i} q F_{\mu \nu}$ on the field. This is analogous to the case of gravitation, where the Riemann tensor $\boldsymbol{R}$ takes the place of $\tilde{\boldsymbol{F}}$.
Exercises
(44.1) We saw in Chapter 43 how the Bianchi identity followed from the fact that the sum of curvature-induced rotations associated with the six faces of an elementary cube is zero. Following the approach of Ryder (1985), we can repeat this for a gauge field with field strength $\left[D_{\mu}, D_{\nu}\right]=-\mathrm{i} g G_{\mu \nu}$.
Fig. 44.5 A path around the faces of a cube.
Referring to Fig. 44.5, the path around the cube is $(\mathcal{ABCDAPSRQPA})+(\mathcal{ADSPABQRCBA})+(\mathcal{APQBADCRSA})$
We will evaluate the effect of traversing the first of these paths.
(a) Find the effect on a field psi\psi of traversing the path ABCDA\mathcal{A B C D} \mathcal{A}.
(b) Next, compute the effect of traversing the path AP\mathcal{A P}.
(c) Now the circuit PSRQP\mathcal{P S R} \mathcal{Q P}.
(d) Then the return path PA\mathcal{P} \mathcal{A}.
(e) Finally, combine all of the paths together to
obtain the result to leading order in the interaction strength gg.
(f) Show that if the procedure is repeated around each of the three paths, the result is that the field changes by a factor
\begin{equation*}
1-\mathrm{i} g \Delta V^{\rho \mu \nu}\left(D_{\rho} G_{\mu \nu}+D_{\mu} G_{\nu \rho}+D_{\nu} G_{\rho \mu}\right), \tag{44.26}
\end{equation*}
where $\Delta V^{\rho \mu \nu}$ is the volume of the cube.
(g) Show that the previous expression is consistent with the Jacobi identity
(44.2) Compute the energy-momentum tensor for complex scalar field theory in flat space.
Weak gravitational fields
We must touch his weaknesses with a delicate hand. There are some faults so nearly allied to excellence, that we scarce weed out the fault without eradicating the virtue.
Oliver Goldsmith (1728-1774) The Good-Natured Man
We have previously met applications of gravitation in circumstances in which gravity is at its strongest. This is exemplified by the physics of the black hole, where the extreme gravitational effects lead to a singularity. In this chapter, we discuss the opposite limit: weak gravitational fields. The subject is particularly important since this is the limit we experience most often in our interactions with Nature. Here we examine how best to treat gravitation in the limit that its effects are weak, and we shall build an approximate field theory that is notable in that it hosts wave-like excitations. As a warm-up, we start by examining how the Einstein equations map onto Newton's theory of gravity in the limit of weak fields.${ }^{1}$
45.1 The Newtonian limit
The Newtonian limit is the limit of velocities which are small compared to $c$ (as measured by a particular${ }^{2}$ observer). This corresponds to a velocity vector $\boldsymbol{u}$, with components $\left(u^{0}, u^{1}, u^{2}, u^{3}\right)$, for which the timelike component $u^{0}$ is much larger than the spacelike components $u^{i}$. In other words, slow-moving objects have velocities with components written approximately as $(1, \vec{u})$ with${ }^{3}$ $u^{i} \ll 1$. In order that gravitational effects do not cause masses to attain high velocities, this limit also implies that the gravitational field is limited in strength. Since it is mass-energy that causes spacetime to curve, a limitation of the strength of gravitational curvature constrains the energy-momentum tensor. Specifically, we demand that in the Newtonian limit, all stresses $T^{i j}$ must be small compared to the density of mass energy $T^{00}$, so that${ }^{4}$
This provides a working definition of what we mean by weak gravitation. In the weak-field limit, spacetime is almost flat. Perfectly flat spacetime is described by the Minkowski metric $\boldsymbol{\eta}$ and so the weak field causes a small perturbation to this underlying geometry. The deviation from flatness is encoded in a small additive correction to
${ }^{1}$ Some of the arguments presented in this chapter were trailed back in Section 14.1.
${ }^{2}$ Not all observers will agree that a particular velocity is in the Newtonian limit.
${ }^{3}$ That is to say, of course, that $u^{i} \ll c$. Our choice of units obscures the identification of when relativistic effects should be expected to be large. The relevant factor for velocities is $v^{2} / c^{2}$, which approaches unity in the relativistic limit. Using some dimensional analysis we can replace $v^{2}$ with $G M / r$, which has the same dimensions, and so we conclude that when the factor $\varepsilon = GM/r$ is of order 1, relativistic effects are important. When $\varepsilon \ll 1$, we are in the Newtonian limit.
${ }^{4}$ See Section 14.1 for further justification of this point.
${ }^{5}$ This allows gravitation to be treated in the same way as the other classical field theories, such as electromagnetism, using the tools from Chapter 40.
the Minkowski metric, such that the components of the full metric g\boldsymbol{g} are written as
with |h_(alpha beta)|≪1\left|h_{\alpha \beta}\right| \ll 1. The small additional components h_(alpha beta)h_{\alpha \beta} will act as a symmetric tensor field. To see this, consider a Lorentz transformation of coordinates x^(mu)=Lambda^(mu)_(nu^('))x^(nu^('))x^{\mu}=\Lambda^{\mu}{ }_{\nu^{\prime}} x^{\nu^{\prime}}. Equation 45.3 is altered when we try to transform the components of g\boldsymbol{g} from a frame with primed indices into a frame with unprimed indices using the Lorentz transformations. We have
It is convenient, therefore, to conceptualize weak gravitational fields as existing in a flat spacetime with Minkowski metric eta\boldsymbol{\eta}, but filled with a symmetric tensor field ^(5)h(x){ }^{5} \boldsymbol{h}(x).
It should be stressed though that this picture is a fiction! It is not possible within general relativity to exactly describe a curved spacetime as flat Minkowski space with an additional field defined on it; hence the reason for Einstein's geometrical theory in the first place. So we should keep in mind that the spacetime is actually curved and it is only an approximation to treat it as flat. Indeed, $\boldsymbol{h}$ only looks like a tensor when we assume flat spacetime, and if we were to be forced to consider coordinate transformations other than the Lorentz transformations, then $\boldsymbol{h}$ would not necessarily transform as a tensor. Nonetheless, this picture of a flat spacetime with a field $\boldsymbol{h}(x)$ overlaid is a very useful one that allows us to compute the effects of gravitation using $\boldsymbol{h}(x)$, while raising and lowering indices using the components of the Minkowski metric. In addition, in order to match the boundary condition that the gravitational field dies away at infinity [i.e. the potential $\Phi(r \rightarrow \infty)=0$], we will also demand that $\boldsymbol{h}(r \rightarrow \infty)=0$.
We have said several times how unique gravitation is in its treatment of the metric field. The weak-field approximation has the great advantage that it gives us an approximate theory that closely resembles all other classical theories of fields (in flat spacetime) in which classical fields fill all of spacetime. So now, instead of gravitation determining the metric field, gravitation is described by the field h\boldsymbol{h} in flat spacetime, and it is the dynamics of this h\boldsymbol{h}-field that we seek to describe. This is a really useful simplification. The next step in our argument is to find a linear version of the Einstein equation that involves the h\boldsymbol{h}-field, rather than the Einstein tensor G\boldsymbol{G}, and this linearized approximation to general relativity, which is valid in the weak-field limit, will also set us up for exploring the properties of gravitational waves.
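As a rough numerical aside (the masses and radii below are approximate standard values, not from the text), the weak-field criterion $\varepsilon=GM/(rc^{2}) \ll 1$ from footnote 3 is comfortably satisfied in most systems we meet:

```python
# Evaluate the weak-field parameter eps = G M / (r c^2) for a few
# systems. Masses and radii are approximate standard values, quoted
# here purely as an illustration (they are not from the text).

G = 6.674e-11   # gravitational constant, m^3 kg^-1 s^-2
c = 2.998e8     # speed of light, m/s

systems = {
    'Earth surface':        (5.97e24, 6.37e6),
    'Sun surface':          (1.99e30, 6.96e8),
    'neutron star surface': (2.8e30, 1.2e4),   # ~1.4 solar masses, ~12 km
}

eps = {name: G * M / (r * c**2) for name, (M, r) in systems.items()}
for name, value in eps.items():
    print(f'{name}: eps = {value:.1e}')
```

The Earth ($\varepsilon \sim 10^{-9}$) and Sun ($\varepsilon \sim 10^{-6}$) sit deep in the Newtonian regime, while at a neutron star surface $\varepsilon$ approaches order unity and the weak-field treatment breaks down.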
45.2 Linearized theory of gravitation
General relativity is a nonlinear field theory. This may be traced back to the products of the connection coefficients that feature in the expression for the Riemann tensor. ^(6){ }^{6} In the limit of small interaction strength, many nonlinear field theories can be linearized by dropping nonlinear terms rendered small by the small size of the interaction strength. This is what we shall do here.
In the last section, we saw that the components of the metric may be written as g_(alpha beta)=eta_(alpha beta)+h_(alpha beta)g_{\alpha \beta}=\eta_{\alpha \beta}+h_{\alpha \beta}, where we agree to raise and lower indices with eta^(mu nu)\eta^{\mu \nu} and eta_(mu nu)\eta_{\mu \nu}, respectively. With this in mind, the connection coefficients are given by
where we have extended the comma notation to take in up indices as well as down ones. We'll use this expression for the connections to compute the components of the Riemann tensor. However, since we are seeking a linearized theory, we ignore terms that are products of the Gamma\Gamma s, leaving us with a simplified expression for the components of R\boldsymbol{R} that reads
where, in the final term we have defined the trace h=h^(alpha)_(alpha)=eta^(alpha beta)h_(alpha beta)h=h^{\alpha}{ }_{\alpha}=\eta^{\alpha \beta} h_{\alpha \beta}.
The tools from the last example allow us to work out the Einstein equation G_(mu nu)=R_(mu nu)-(1)/(2)g_(mu nu)R=8piT_(mu nu)G_{\mu \nu}=R_{\mu \nu}-\frac{1}{2} g_{\mu \nu} R=8 \pi T_{\mu \nu}. We obtain the rather unmemorable expression
This means that G_(mu nu)= bar(R)_(mu nu)G_{\mu \nu}=\bar{R}_{\mu \nu}. Note that bar(bar(h))_(mu nu)=h_(mu nu)\overline{\bar{h}}_{\mu \nu}=h_{\mu \nu}, so that
${ }^{8}$ Recall the situation in electromagnetism where we are free to pick the components of the electromagnetic field 1-form $\tilde{\boldsymbol{A}}(x)$. Specifically, we can always make the transformation $A_{\mu} \rightarrow A_{\mu}-\chi_{, \mu}$ without making any changes to Maxwell's equations. We can therefore pick the function $\chi$ in such a way as to simplify the problem at hand.
${ }^{9}$ Unlike the transformation in Chapter 44: $\theta(x) \rightarrow \theta(x)+\alpha(x)$, where we were changing an internal variable at each point in spacetime, we must label the underlying points in the manifold with labels like $\mathcal{P}$, since the spacetime coordinates themselves are being transformed. The choice of length scale in the coordinates implied by the choice of $\xi^{\mu}(\mathcal{P})$ in eqn 45.17 was called a 'gauge' by Hermann Weyl, where the use of the term is motivated by the 'gauge' of wires and rails.
Using the trace-reversed fields we find the Einstein tensor becomes
We have succeeded in writing down a linearized version of the Einstein equation in terms of the h\boldsymbol{h}-field. It has the disadvantage of looking rather complicated. However, a drastic simplification is possible, by exploiting the notion of gauge freedom discussed in Chapters 42 and 44.
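The trace-reversed field used here, $\bar{h}_{\mu\nu}=h_{\mu\nu}-\frac{1}{2}\eta_{\mu\nu}h$, flips the sign of the trace and is its own inverse, which is why one can move freely between $h_{\mu\nu}$ and $\bar{h}_{\mu\nu}$. A quick numerical sketch (illustrative, not from the text):

```python
import numpy as np

# Check numerically that trace reversal,
#   hbar_{mu nu} = h_{mu nu} - (1/2) eta_{mu nu} h,  h = eta^{ab} h_{ab},
# flips the sign of the trace and is an involution (hbarbar = h).

eta = np.diag([-1.0, 1.0, 1.0, 1.0])   # Minkowski metric (also its own inverse)

rng = np.random.default_rng(0)
h = rng.normal(size=(4, 4))
h = (h + h.T) / 2                       # a generic symmetric perturbation

def trace_reverse(hmat):
    trace = np.einsum('ab,ab->', eta, hmat)   # h = eta^{ab} h_{ab}
    return hmat - 0.5 * eta * trace

hbar = trace_reverse(h)
print(np.allclose(trace_reverse(hbar), h))    # True: involution
```

The algebra behind this is short: in four dimensions the trace of $\bar{h}$ is $h-\frac{1}{2}\cdot 4\,h=-h$, so reversing twice restores $h_{\mu\nu}$.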
45.3 Exploiting gauges
In simplifying the Einstein equation, it would be very helpful to us if we were allowed to write
since this would kill off several terms in eqn 45.15. It turns out that this is a condition we can indeed impose. The reason for this stems from the gauge invariance properties of the Einstein equation.${ }^{8}$
In order to see the gauge transformation properties, we allow the coordinates to transform via an infinitesimal coordinate transformation ^(9){ }^{9}
where P\mathcal{P} is some point and the shifts xi^(mu)(P)\xi^{\mu}(\mathcal{P}) are small enough that the components of h\boldsymbol{h} in the primed frame obey |h_(mu^(')nu^('))|≪1\left|h_{\mu^{\prime} \nu^{\prime}}\right| \ll 1. This tiny transformation is designed to make only tiny changes to the smoothly varying fields that exist in the almost-flat spacetime. However, there is one field where they cannot be ignored: the metric. This is because it is exactly the very small variations in the metric that describe gravitation in this limit. Recall that metric components have the transformation law
Substituting from eqns 45.3 and 45.17, we see that $g_{\rho^{\prime} \sigma^{\prime}}\left(x^{\alpha^{\prime}}=a^{\alpha}\right)=\eta_{\rho^{\prime} \sigma^{\prime}}+h_{\rho^{\prime} \sigma^{\prime}}\left(x^{\alpha^{\prime}}=a^{\alpha}\right)-\xi_{\rho^{\prime}, \sigma^{\prime}}-\xi_{\sigma^{\prime}, \rho^{\prime}}+$ (corrections).
Example 45.2
Let's check this last expression. We have
where we ignore higher order terms in the derivatives of the transformation components xi^(mu)\xi^{\mu}. We then have a right-hand side of eqn 45.18 given by
The other fields in the linearized theory are not changed appreciably by our infinitesimal coordinate transformations. We sketch this property for the Riemann tensor in the next example.
Example 45.3
We need to show R_(mu nu alpha beta)^("new ")=R_(mu nu alpha beta)^("old ")R_{\mu \nu \alpha \beta}^{\text {new }}=R_{\mu \nu \alpha \beta}^{\text {old }}. Under the gauge transformation we have
where here the bracket notation for antisymmetrization has been used. The added terms are all zero (since derivatives commute) and the Riemann tensor is left unchanged.
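This invariance can be checked symbolically. In the sketch below (illustrative, not from the text; the linearized Riemann components are written in one common convention, $R_{\mu\nu\alpha\beta}=\frac{1}{2}(h_{\mu\beta,\nu\alpha}+h_{\nu\alpha,\mu\beta}-h_{\nu\beta,\mu\alpha}-h_{\mu\alpha,\nu\beta})$, assumed here), every component is unchanged by the gauge shift $h_{\mu\nu} \rightarrow h_{\mu\nu}-\xi_{\mu,\nu}-\xi_{\nu,\mu}$ precisely because partial derivatives commute:

```python
import sympy as sp

# Check that the linearized Riemann components (one common convention,
# assumed for this sketch),
#   R_{m nu al be} = (1/2)(h_{m be,nu al} + h_{nu al,m be}
#                          - h_{nu be,m al} - h_{m al,nu be}),
# are unchanged by the gauge shift h_{ab} -> h_{ab} - xi_{a,b} - xi_{b,a}.

t, x = sp.symbols('t x', real=True)
coords = (t, x)
n = 2   # two coordinates are enough to exercise every term

h = [[sp.Function(f'h{min(a, b)}{max(a, b)}')(t, x) for b in range(n)]
     for a in range(n)]                        # symmetric h_{ab}
xi = [sp.Function(f'xi{a}')(t, x) for a in range(n)]

def riemann(hf, m, nu, al, be):
    d = lambda f, *idx: sp.diff(f, *[coords[i] for i in idx])
    return sp.Rational(1, 2) * (d(hf[m][be], nu, al) + d(hf[nu][al], m, be)
                                - d(hf[nu][be], m, al) - d(hf[m][al], nu, be))

h_new = [[h[a][b] - sp.diff(xi[a], coords[b]) - sp.diff(xi[b], coords[a])
          for b in range(n)] for a in range(n)]

diffs = [sp.simplify(riemann(h_new, m, nu, al, be) - riemann(h, m, nu, al, be))
         for m in range(n) for nu in range(n)
         for al in range(n) for be in range(n)]
print(all(dd == 0 for dd in diffs))  # True: gauge invariant
```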
We are therefore free to make infinitesimal coordinate transformations without changing the physics. Using the analogy from Chapter 44, we note that our choice of coordinates in this context is the same as a choice of gauge. We therefore choose coordinates such that
This is analogous to Lorenz gauge in electromagnetism, which is the condition $A^{\mu}{ }_{, \mu}=0$. This gauge causes the Einstein equation to reduce to the far friendlier expression
where bar(h)_(mu nu)=h_(mu nu)-(1)/(2)eta_(mu nu)h\bar{h}_{\mu \nu}=h_{\mu \nu}-\frac{1}{2} \eta_{\mu \nu} h.
We have therefore derived a linear field theory of gravitation. It has the form of a wave equation ^(12){ }^{12} in the absence of the source term 16 piT_(mu nu)16 \pi T_{\mu \nu}. ^(10){ }^{10} We can compare the situation in gravitation, where a coordinate change x^(mu)(P)rarrx^(mu)(P)+xi^(mu)(P)x^{\mu}(\mathcal{P}) \rightarrow x^{\mu}(\mathcal{P})+\xi^{\mu}(\mathcal{P}) mandates a change in the metric field components
with that of electromagnetism, where a change in an internal variable theta(P)rarr\theta(\mathcal{P}) \rightarrowtheta(P)+alpha(P)\theta(\mathcal{P})+\alpha(\mathcal{P}) mandates a change in the electromagnetic field components
^(12){ }^{12} Reminder: A wave equation has the form del^(2)phi=-(del^(2)phi)/(delt^(2))+ vec(grad)^(2)phi=0\partial^{2} \phi=-\frac{\partial^{2} \phi}{\partial t^{2}}+\vec{\nabla}^{2} \phi=0. Its solutions are plane waves of the form phi=Ae^(-i(omega t- vec(k)* vec(x)))\phi=A \mathrm{e}^{-\mathrm{i}(\omega t-\vec{k} \cdot \vec{x})}.
The analogous expression in electromagnetism is -del^(2)A^(mu)=J^(mu)-\partial^{2} A^{\mu}=J^{\mu}. The (retarded) potential in electromagnetism that solves this latter wave equation is
where the factor of GG has been restored. We shall examine this further in the next two chapters.
Example 45.4
Let's check out eqn 45.31 in the limit of a non-relativistic stationary source (i.e. a static distribution of mass). This problem has no time dependence and so eqn 45.31 then reduces to
\begin{equation*}
\bar{h}_{\mu \nu}(\vec{x})=4 G \int \mathrm{~d}^{3} y \frac{T_{\mu \nu}(\vec{y})}{|\vec{x}-\vec{y}|} \tag{45.32}
\end{equation*}
and in the weak-field limit the only significant term in T_(mu nu)( vec(y))T_{\mu \nu}(\vec{y}) is T_(00)( vec(y))~~rho( vec(y))T_{00}(\vec{y}) \approx \rho(\vec{y}), where rho( vec(y))\rho(\vec{y}) is the density distribution of the source. Hence, we only have to worry about
\begin{equation*}
\bar{h}_{00}(\vec{x})=4 G \int \mathrm{~d}^{3} y \frac{\rho(\vec{y})}{|\vec{x}-\vec{y}|} . \tag{45.33}
\end{equation*}
However, this equation reminds us of the equation for the gravitational scalar potential Phi( vec(x))\Phi(\vec{x}) which is
which is, of course, the weak-field metric (eqn 5.22, derived in Section 14.1).
Example 45.5
The energy-momentum tensor for dust takes the form $T^{\mu \nu}=\rho u^{\mu} u^{\nu}$, and so we only keep the dominant term $T^{00}=\rho$ in the weak-field limit. The next-most important terms are $T^{0 i}=\rho u^{i}$ and these terms are smaller than $T^{00}$ by a factor${ }^{13}$ $v / c$. The terms $T^{i j}=\rho u^{i} u^{j}$ are smaller than $T^{00}$ by a factor of $(v / c)^{2}$ and so can be safely ignored, but now let's consider the effect of the $T^{0 i}$ terms. We can now relate $\bar{h}_{0 i}$ to a gravitational analogue of the vector potential $\vec{A}_{\mathrm{g}}$ by writing
${ }^{13}$ Recall that, on inserting the factors of $c$, $T^{00}=\rho c^{2}$ and $T^{0 i}=\rho c u^{i}$.
where J^(i)( vec(y))=rho( vec(y))u^(i)( vec(y))J^{i}(\vec{y})=\rho(\vec{y}) u^{i}(\vec{y}) is the current density, in which case we deduce that (on lowering indices, and using h_(0i)= bar(h)_(0i)h_{0 i}=\bar{h}_{0 i} )
Now, using our linearized equation of gravitation (eqn 45.29), and ignoring the smaller time-dependent term, we have
\begin{equation*}
\nabla^{2} \Phi=4 \pi G \rho \quad \text { and } \quad \nabla^{2} \vec{A}_{\mathrm{g}}=4 \pi G \vec{J} \tag{45.40}
\end{equation*}
which are Poisson's equations for the gravitational scalar and vector potentials. These are, of course, reminiscent of the analogous equations for electromagnetism, namely
\begin{equation*}
\nabla^{2} V=-\frac{\rho}{\epsilon_{0}} \quad \text { and } \quad \nabla^{2} \vec{A}=-\mu_{0} \vec{J} \tag{45.41}
\end{equation*}
where the minus signs reflect the fact that like charges repel in electromagnetism but masses attract in gravity. Pursuing this analogy, we can define the gravitoelectric field vec(E)_(g)=- vec(grad)Phi\vec{E}_{\mathrm{g}}=-\vec{\nabla} \Phi and the gravitomagnetic field vec(B)_(g)= vec(grad)xx vec(A)_(g)\vec{B}_{\mathrm{g}}=\vec{\nabla} \times \vec{A}_{\mathrm{g}} and, by analogy with electromagnetism, these will satisfy the gravitational Maxwell equations
\begin{equation*}
\vec{\nabla} \cdot \vec{E}_{\mathrm{g}}=-4 \pi G \rho, \quad \vec{\nabla} \cdot \vec{B}_{\mathrm{g}}=0, \quad \vec{\nabla} \times \vec{E}_{\mathrm{g}}=0, \quad \vec{\nabla} \times \vec{B}_{\mathrm{g}}=-4 \pi G \vec{J} \tag{45.42}
\end{equation*}
These are analogous to the electromagnetic Maxwell equations ignoring any time-dependence. This analogy between gravity and electromagnetism can be taken further and it can be shown that, just as a charge current in electromagnetism creates a magnetic field that exerts a $\vec{v} \times \vec{B}$ force, so a current of mass creates a gravitomagnetic field that exerts a $\vec{v} \times \vec{B}_{\mathrm{g}}$ force. This is the basis of the Lense-Thirring effect in which a spinning mass (in which you have rotational currents of mass) will drag objects around with the rotation.${ }^{14}$
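For a point source, the static gravitational Maxwell equations can be checked directly: with $\Phi=-GM/r$, the gravitoelectric field $\vec{E}_{\mathrm{g}}=-\vec{\nabla}\Phi$ is curl-free everywhere and divergence-free away from the mass. A SymPy sketch (illustrative, not from the text):

```python
import sympy as sp

# For the point-mass potential Phi = -G M / r, the gravitoelectric field
# E_g = -grad Phi is curl-free everywhere and divergence-free away from
# the source, consistent with eqn 45.42 when rho = 0 and J = 0.

x, y, z = sp.symbols('x y z', real=True)
G, M = sp.symbols('G M', positive=True)
coords = (x, y, z)

r = sp.sqrt(x**2 + y**2 + z**2)
Phi = -G * M / r
E = [-sp.diff(Phi, c) for c in coords]

div_E = sp.simplify(sum(sp.diff(E[i], coords[i]) for i in range(3)))
curl_E = [sp.simplify(sp.diff(E[(i + 2) % 3], coords[(i + 1) % 3])
                      - sp.diff(E[(i + 1) % 3], coords[(i + 2) % 3]))
          for i in range(3)]
print(div_E, curl_E)  # 0 [0, 0, 0]
```

The divergence vanishes only away from the origin; the delta-function source at $r=0$ carries the $-4\pi G\rho$ on the right-hand side of eqn 45.42.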
Weak-field gravitation is a very useful limit for much of the Universe we encounter, in which gravity is a relatively small perturbation. After all, most of the Universe is pretty empty of matter. In the next chapter, we will solve eqn 45.29 in matter-free spacetime, so that the energy-momentum tensor $\boldsymbol{T}=0$ and we then have $-\partial^{2} \bar{h}_{\mu \nu}=\left(\frac{\partial^{2}}{\partial t^{2}}-\vec{\nabla}^{2}\right) \bar{h}_{\mu \nu}=0$. This is a wave equation and predicts the existence of gravitational waves!
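As a preview of that claim, one can check symbolically that a plane wave solves the source-free equation precisely when $\omega^{2}=k^{2}$, i.e. the wave travels at the speed of light. A minimal sketch (illustrative, not from the text), for one Cartesian direction and a single component of $\bar{h}$:

```python
import sympy as sp

# A single component of hbar as a plane wave in one spatial direction:
# hbar = A exp(-i(w t - k x)) solves (d^2/dt^2 - d^2/dx^2) hbar = 0
# exactly when w^2 = k^2, i.e. the wave travels at speed c = 1.

t, x, w, k, Amp = sp.symbols('t x omega k A')
hbar = Amp * sp.exp(-sp.I * (w * t - k * x))

wave_op = sp.diff(hbar, t, 2) - sp.diff(hbar, x, 2)
dispersion = sp.simplify(wave_op / hbar)
print(dispersion)  # k**2 - omega**2: vanishes exactly when omega = +/- k
```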
Chapter summary
Weak-field gravitation can be approximately described using a Minkowski metric with a small additive correction given by the weak-field metric h(x)\boldsymbol{h}(x).
In terms of the weak-field metric, the Einstein equation is
where bar(h)_(mu nu)=h_(mu nu)-(1)/(2)eta_(mu nu)h\bar{h}_{\mu \nu}=h_{\mu \nu}-\frac{1}{2} \eta_{\mu \nu} h. This is a wave equation in the absence of sources of mass-energy.
Under an infinitesimal coordinate change $x^{\mu} \rightarrow x^{\mu}+\xi^{\mu}$, we have
where dSigma\mathrm{d} \Sigma is an element of 3 -volume and the energy-momentum tensor is evaluated at position vec(x)^(')\vec{x}^{\prime}.
(45.3) For a low-velocity perfect fluid in the weak-field limit, show the following:
(a) The gravitoelectric energy density, defined as rho_(g)=2T_(tt)-eta_(tt)T\rho_{\mathrm{g}}=2 T_{t t}-\eta_{t t} T, is given by
\begin{equation*}
\rho_{\mathrm{g}} \approx \rho+3 p \tag{45.49}
\end{equation*}
(b) The gravitomagnetic current density, $\Pi_{i}=-T_{t i}+\frac{1}{2} \eta_{t i} T$, is given by
where u_(i)u_{i} are velocity 1 -form components.
(c) The curvature energy density, rho_(c)=2T_(ii)-\rho_{c}=2 T_{i i}-eta_(ii)T\eta_{i i} T, is given by
This curvature energy density is the quantity in eqn 45.48 that causes the diagonal, spatial parts of h\boldsymbol{h} to be non-zero. These are the parts of g\boldsymbol{g} that give the deviations from a flat-spacetime metric.
(45.4) (a) Show that a particle moving in the weak, stationary field obeys the geodesic equation
where rho_(g)\rho_{\mathrm{g}} was defined in the previous question.
(45.5) (a) Show that in a Universe filled with only vacuum energy, the gravitoelectric energy density is negative, making gravity repulsive.
(b) Show further that in terms of the potential Phi\Phi, we obtain an equation vec(grad)^(2)Phi=-Lambda\vec{\nabla}^{2} \Phi=-\Lambda, which is solved by a potential Phi=alphar^(2)\Phi=\alpha r^{2}, where alpha\alpha is a constant that should be determined.
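The key identity for part (b) is the Laplacian of $r^{2}$ in Cartesian coordinates; the quick symbolic check below (a sketch, not a full solution) verifies it, leaving $\alpha$ to be fixed by matching $\vec{\nabla}^{2}\Phi=-\Lambda$:

```python
import sympy as sp

# The key identity for this exercise: the Laplacian of alpha r^2 in
# Cartesian coordinates. (This only verifies the identity; the constant
# alpha then follows from matching grad^2 Phi = -Lambda.)

x, y, z, alpha = sp.symbols('x y z alpha', real=True)
Phi = alpha * (x**2 + y**2 + z**2)   # alpha r^2

laplacian = sum(sp.diff(Phi, c, 2) for c in (x, y, z))
print(sp.simplify(laplacian))  # 6*alpha
```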
(45.6) We shall demonstrate the precession of angular momentum in a gravitational field. We follow the approach of Ryder (2009).
In the weak-field limit, the geometry outside a rotating object leads to a metric with components
where J^(i)J^{i} are components of the angular momentum of the source, which has mass MM.
(a) Assuming that, for application to the Earth, the quantities $\phi^{2}$, $\phi \zeta_{i}$, and $\zeta_{i} \zeta_{j}$ can be ignored, show that the components of $g^{\mu \nu}$ are the same as
those of g_(mu nu)g_{\mu \nu} with the sign in front of phi\phi reversed.
(b) Show that, to leading order, the connection coefficients are
(45.7) Define a (3+1)(3+1)-dimensional spin by a 1 -form with components S_(mu)S_{\mu} which is orthogonal to the velocity.
(a) Show that $S_{0}=-\left(\mathrm{d} x^{i} / \mathrm{d} t\right) S_{i}$.
(b) If such a spin is parallel transported along a geodesic, show further that
is a constant when parallel transported.
(45.9) To see the consequences of Exercise 45.7, we define a slightly updated version of the spin vector through the 3 -vector equation
Working to order vec(v)^(2) vec(S)^(2)\vec{v}^{2} \vec{S}^{2} and phi vec(S)^(2)\phi \vec{S}^{2}, and using the fact that the spin 4 -vector is parallel transported, show that
(a) phi vec(S)^(2)=phi vec(Sigma)^(2)\phi \vec{S}^{2}=\phi \vec{\Sigma}^{2}.
(b) ( vec(v)* vec(S))^(2)=( vec(v)* vec(Sigma))^(2)(\vec{v} \cdot \vec{S})^{2}=(\vec{v} \cdot \vec{\Sigma})^{2},
and hence that
(c) vec(Sigma)^(2)=\vec{\Sigma}^{2}= const.
This new 3-vector has a constant length and so is suitable for demonstrating precession.
(d) Explain why
(e) Use the results from part (d) to show that, if we ignore terms of order $v^{2} \mathrm{~d} \vec{S} / \mathrm{d} t$ or $\phi \mathrm{~d} \vec{S} / \mathrm{d} t$, we have
\begin{equation*}
\frac{\mathrm{d} \vec{\Sigma}}{\mathrm{d} t}=\frac{\mathrm{d} \vec{S}}{\mathrm{~d} t}+(\vec{\nabla} \phi \cdot \vec{v}) \vec{S}+\frac{1}{2}(\vec{v} \cdot \vec{S}) \vec{\nabla} \phi+\frac{1}{2}(\vec{\nabla} \phi \cdot \vec{S}) \vec{v} . \tag{45.64}
\end{equation*}
(f) Using the result of Exercise 45.7, show finally that
Hint: It is permissible to swap vec(S)\vec{S} for vec(Sigma)\vec{\Sigma} in the final step to obtain the quoted equation of motion. This is the equation of motion for the precession of a vector vec(Sigma)\vec{\Sigma} around the direction of vec(Omega)\vec{\Omega}, and implies that a gyroscope in orbit around the Earth will precess.
(g) Using phi=-M//r\phi=-M / r and zeta_(i)=(2//r^(3))epsi_(ikm)x^(k)J^(m)\zeta_{i}=\left(2 / r^{3}\right) \varepsilon_{i k m} x^{k} J^{m}, show that
The first term in this equation depends on the angular momentum of the Earth vec(J)\vec{J} and gives rise to the Lense-Thirring effect. The second term gives the de Sitter-Fokker effect, usually called geodetic motion. Geodetic motion is a precession effect caused by the motion of the gyroscope around a geodesic and implies that on completing an orbit of a non-rotating mass, a gyroscope will precess. The Lense-Thirring precession is more remarkable as it arises purely from the rotation of the Earth. Notice how the field vec(Omega)\vec{\Omega} causing the precession has the form of an electromagnetic dipole.
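As a rough numerical illustration of the geodetic term (the orbital parameters below are approximate values chosen to resemble a Gravity Probe B-like polar low-Earth orbit; they are not from the text), the precession works out to a few arcseconds per year:

```python
import math

# Order-of-magnitude estimate of the geodetic (de Sitter-Fokker)
# precession for a gyroscope on a circular low-Earth orbit. The
# magnitude used, Omega ~ (3/2) G M v / (c^2 r^2), is the size of the
# geodetic term for a circular orbit; the orbital parameters are
# approximate illustrative values (not from the text), resembling a
# Gravity Probe B-like orbit at ~640 km altitude.

G = 6.674e-11   # m^3 kg^-1 s^-2
c = 2.998e8     # m/s
M = 5.97e24     # kg (Earth)
r = 7.02e6      # m (orbital radius)

v = math.sqrt(G * M / r)                   # circular orbital speed
Omega = 1.5 * G * M * v / (c**2 * r**2)    # precession rate, rad/s

seconds_per_year = 3.156e7
arcsec_per_year = Omega * seconds_per_year * math.degrees(1) * 3600
print(f'geodetic precession ~ {arcsec_per_year:.1f} arcsec/yr')
```

This lands at a few arcseconds per year, the order of magnitude measured by the Gravity Probe B experiment; the Lense-Thirring term from the Earth's own angular momentum $\vec{J}$ is smaller still.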